I. INTRODUCTION

I.1 Motivation

One  routing  strategy  frequently  used  in  computer  networks  assigns 
traffic  dependent  lengths  to  the  links  of  the  network  and  then  apportions 
the  traffic  among  the  various  paths  depending  on  their  relative  distances. 
(The distance of a path is the sum of the lengths of its links.)
Periodically,  the  lengths  are  updated  to  reflect  changes  in  the  traffic, 
path  distances  are  recomputed,  and  traffic  is  appropriately  rerouted. 

Thus,  the  strategy  is  somewhat  adaptive.  Implementation  of  this  strategy 
requires  that  four  basic  subproblems  be  addressed. 

1)  Dependence  of  link  lengths  on  the  traffic 

2)  Computation  of  path  distances 

3)  Allocation  of  traffic 

4)  Frequency  of  updating. 

Parts 1) and 3) are generally handled by choosing some cost function,
assigning to each link a length that reflects the cost of using it, and
then routing traffic so as to minimize the total cost. Methods for
doing these things involve considerations of multicommodity flow problems
and the statistics of network traffic. They present formidable problems
because

a) it is not always obvious how to choose a meaningful
(in terms of network performance) cost function,

b) given a cost function, it is not easy to see how to
assign link distances in an appropriate manner, and

c) given the link distances, the optimal traffic
allocation pattern can be difficult to find.

Much  of  the  previous  work  done  on  routing  has  focused  on  these  problems 
and  the  reader  is  referred  to  [1]  and  [2]  for  representative  studies. 
Limitations  of  the  computational  resources  directly  affect  2)  and  hence 
the whole strategy. Since the total number of paths can grow exponentially
with the number of nodes, it is often not feasible to compute all
path  distances.  Frequently,  only  the  shortest  paths  are  computed  and 
the  amount  of  traffic  flow  allocated  to  these  paths  is  increased  relative 
to  the  old  values.  The  frequency  with  which  updates  can  be  performed 
depends  on  how  much  of  the  network's  resources  the  updating  procedure 
uses. 

If  a central  facility  monitors  all  network  traffic,  2)  is  purely 
a computational  problem,  and  there  are  several  well  known  algorithms 
for  computing  shortest  paths.  If  traffic  is  only  locally  monitored, 
these  algorithms  can  still  be  used  provided  all  other  nodes  send  their 
local  information  to  the  central  computer.  This  is  a communication 
problem.  In  either  case,  the  network  is  completely  dependent  on  the 
central  facility  for  purposes  of  routing.  Hence,  it  is  desirable  to 
have  procedures  in  which  the  nodes  begin  with  only  local  information  and 
compute  path  distances  by  communicating  with  one  another.  If  a node  or 
link  is  not  functioning,  it  is  automatically  excluded  from  the  update, 
but  other  nodes  proceed  with  the  remaining  links. 
This thesis presents and analyzes several such distributed
shortest  path  algorithms,  and  hence  bears  on  part  2)  of  this  strategy. 

The  link  lengths  are  assumed  to  be  already  assigned  and  are  taken  as 
part  of  the  input.  Each  node  initially  knows  the  lengths  of  its  outgoing 
arcs  and  possibly  some  things  about  the  topology  of  the  network.  It 
finds  shortest  paths  to  other  nodes  by  communicating  with  them.  It  is 
assumed  that  the  arc  lengths  and  topology  are  all  fixed  during  execution. 
The  problem  of  routing  in  the  presence  of  failures  is  indeed  significant, 
but  here  we  are  only  concerned  with  the  amount  of  information  that  the 
nodes  must  exchange  in  order  to  find  shortest  paths  in  a fixed  graph. 

The  problem  is  discussed  abstractly  in  that  protocol  issues  are  ignored. 
It  is  assumed  that  there  is  a protocol  enabling  nodes  to  communicate 
reliably  with  one  another,  and  we  count  only  the  information  relevant 
to  the  algorithm.  For  example,  if  one  node  sends  to  a neighbor  the 
length  of  some  link,  we  count  only  the  number  of  bits  needed  to  encode 
this  length,  even  though  this  "message"  will  have  protocol  bits  appended. 
This  abstraction  of  communication  cost  is  analogous  to  that  made  in  the 
study of centralized algorithms. Elementary operations such as addition
or comparison are often assigned unit cost since they only require some
bounded number of machine language instructions to implement. Similarly,
we view protocol as a fixed overhead and are interested in how the
abstract communication cost grows as a function of the size of the
network.
I.2 Notation and Definitions

For our purposes, a network is represented by a directed graph or
digraph $G=(N,A)$, where $N$ is a set of vertices or nodes labelled
$1,\ldots,n$ and $A \subseteq N \times N$ is a set of arcs. An arc from
node $i$ to node $j$ is denoted by $(i,j)$. We assume all arcs are
nontrivial, i.e. $(i,j) \in A \Rightarrow i \neq j$. The set
$\{j \mid (i,j) \in A\}$ is termed the adjacency list of $i$, $AL(i)$,
and a node $j \in AL(i)$ is called a neighbor of $i$. The cardinality of
$AL(i)$, $|AL(i)|$, is called the outdegree of $i$, $OD(i)$. Similarly,
the indegree of node $i$, $ID(i)$, is defined as
$|\{(k,i) \mid (k,i) \in A\}|$. Note

$$\sum_i OD(i) = \sum_i ID(i) = |A|.$$

$G$ can be naturally associated with an undirected graph,
$\bar{G}=(N,L)$, where $N$ is the same set of nodes as in $G$, and $L$
is a set of undirected edges or links, $\langle i,j \rangle$. We take
$\langle i,j \rangle \in L$ if either $(i,j) \in A$ or $(j,i) \in A$
inclusively. If $(i,j) \in A$ and $(j,i) \in A$, we still take only one
copy of $\langle i,j \rangle$. The degree of node $i$ in $\bar{G}$,
$D(i)$, is defined as
$|\{\langle i,j \rangle \mid \langle i,j \rangle \in L\}|$. Note that
$\langle i,j \rangle$ and $\langle j,i \rangle$ are the same element.
$\sum_i D(i) = 2|L|$ since a link is counted at both endpoints.
i 

A path in $G$ from $i$ to $j$, $P[i,j]$, is a finite sequence of vertices
$[i = i_1, i_2, \ldots, i_k = j]$ such that $(i_s, i_{s+1}) \in A$,
$1 \le s \le k-1$. The arcs $(i_s, i_{s+1})$ are said to be in the path.
If $i=j$ and all other nodes are distinct and different from $i$ then
$P[i,j]$ is a cycle. A path $P[r,t] = [r = j_1, \ldots, j_s = t]$ is a
subpath of $P[i,j]$ if

1) $P[r,t]$ is a subsequence of $P[i,j]$ and

2) $(x,y)$ is an arc in $P[r,t]$ only if $(x,y)$ is an arc in $P[i,j]$.


Paths can be defined analogously for undirected graphs. A directed graph
$G$ is called strongly connected if there is a path in $G$ from $i$ to
$j$, $\forall i \neq j$. $G$ is called weakly connected or just connected
if there is a path in the associated undirected graph $\bar{G}$ from $i$
to $j$, $\forall i \neq j$. Note that in $\bar{G}$, if there is a path
from $i$ to $j$ then there is also one from $j$ to $i$ since the edges
have no direction. This is not true in $G$, and hence we distinguish
strongly connected from connected. The diameter of $G$, $D(G)$, is
defined as

$$D(G) = \max_{i \neq j} \left( \min \text{ \# of links in a path from } i \text{ to } j \text{ in } \bar{G} \right).$$

This means that for any $i,j$, there is a path from $i$ to $j$ in
$\bar{G}$ using at most $D(G)$ edges.
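As an illustration of these definitions, the following Python sketch builds the links of the associated undirected graph from a list of arcs and computes the diameter by breadth-first search from every node. The function names `undirected_links` and `diameter` and the example arc list are assumptions made for this sketch; they do not appear in the thesis.

```python
from collections import deque

def undirected_links(arcs):
    """Return the set L of links <i,j>: one copy per pair, whichever arcs exist."""
    return {frozenset(arc) for arc in arcs}

def diameter(n, arcs):
    """Diameter of the undirected graph associated with the digraph (nodes 1..n)."""
    neighbors = {i: set() for i in range(1, n + 1)}
    for i, j in arcs:
        neighbors[i].add(j)
        neighbors[j].add(i)
    best = 0
    for source in range(1, n + 1):
        hops = {source: 0}              # minimum number of links from source
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        if len(hops) < n:
            raise ValueError("graph is not connected")
        best = max(best, max(hops.values()))
    return best

# Hypothetical example: arcs (1,2) and (2,1) give a single link <1,2>.
example_arcs = [(1, 2), (2, 1), (2, 3), (3, 4)]
example_links = undirected_links(example_arcs)
example_diameter = diameter(4, example_arcs)
```

Here the four arcs yield three links, and the longest minimum-hop path (between nodes 1 and 4) uses three links.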

A weighted digraph $G=(N,A,\ell)$ is a digraph together with a
function $\ell : N \times N \to \mathbb{R} \cup \{\infty\}$, and
$\ell(i,j)$ is called the length of arc $(i,j)$. We take $\ell(i,i)=0$
$\forall i \in N$, $\ell(i,j) < \infty$ $\forall (i,j) \in A$, and
$\ell(i,j) = \infty$ $\forall (i,j) \notin A$, $i \neq j$.
That is, for shortest path problems, we consider those arcs in the graph
to have finite length and missing arcs to have infinite length. The
length of a path $P[i,j] = [i_1, \ldots, i_k]$ is defined as

$$\sum_{s=1}^{k-1} \ell(i_s, i_{s+1}).$$

We will frequently consider asymptotic rates of growth of functions.
To express this notion in compact form, we introduce the "big-oh"
notation. A function $f(n)$ is said to be $O(g(n))$ (order $g(n)$ or
big-oh of $g(n)$) if there exists an $n_0$ and a constant $c$ such that
$f(n) \le c\, g(n)$ for all $n \ge n_0$.

I.3 Problem Formulation

(Much of the material in the remainder of this chapter and in
the next chapter is drawn from [3] and [4]. They are excellent
references for the reader who is unfamiliar with graphs or the shortest
path problem.)

Given a weighted digraph there are 3 shortest path problems
we consider.

1) All Pairs (AP) - Find the length of a shortest path from
$i$ to $j$, $\forall i \neq j$.

2) Single Source (SS) - Find the length of a shortest path
from a source node $i$ to all other nodes.

3) Single Pair (SP) - Find the length of a shortest path
from a given source to a given destination.

In practice, one may actually want the path or just the length. We
attend to such details when necessary.

Assume  for  the  moment  that  G is  strongly  connected.  If  G has 
cycles  of  negative  length,  then  there  is  no  shortest  path  from  i to  j 
for  any  choice  of  i and  j.  Otherwise,  all  of  the  above  problems  have 
well  defined  solutions,  and  there  is  a cycle  free  shortest  path  between 
any  two  nodes.  (Cycles  of  zero  length  can  be  deleted).  One  of  the 
basic  observations  concerning  the  shortest  path  problem  is  the  following 
"Principle  of  Optimality": 

If $P[i,j]$ is a shortest path from $i$ to $j$ then any subpath
$P[r,t]$ must also be a shortest path from $r$ to $t$.

From this principle, we readily obtain the following necessary conditions
for the shortest path lengths in a weighted digraph having no
negative length cycles.

Let $D(i,j)$ = length of a shortest path from $i$ to $j$.

Then

$$D(i,i) = 0$$

$$D(i,j) = \min_{k \neq j} \left( D(i,k) + \ell(k,j) \right)$$

These are called Bellman's Equations (BE). It is shown in [4] that if
$G$ has no nonpositive cycles, then BE are uniquely satisfied by the
shortest path lengths. If $G$ has zero length cycles, then BE may have
solutions other than the shortest path lengths. In practice, however,
computational procedures find the shortest path length solutions even
if $G$ has zero length cycles.
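To make Bellman's Equations concrete, the sketch below checks whether a proposed vector of distances from a fixed source satisfies them. The representation (a dict `arcs` mapping `(k, j)` to a length, a dict `D` mapping each node to its candidate distance) and the name `satisfies_bellman` are assumptions chosen for illustration only.

```python
import math

INF = math.inf

def satisfies_bellman(n, arcs, D, source):
    """Check D[source] == 0 and D[j] == min over k != j of D[k] + l(k, j)."""
    if D[source] != 0:
        return False
    for j in range(1, n + 1):
        if j == source:
            continue
        # Missing arcs have infinite length, as in the text.
        best = min(D[k] + arcs.get((k, j), INF)
                   for k in range(1, n + 1) if k != j)
        if D[j] != best:
            return False
    return True

# Hypothetical 4-node example with positive arclengths.
example_arcs = {(1, 2): 4, (1, 3): 1, (3, 2): 2, (2, 4): 5}
good = satisfies_bellman(4, example_arcs, {1: 0, 2: 3, 3: 1, 4: 8}, source=1)
bad = satisfies_bellman(4, example_arcs, {1: 0, 2: 4, 3: 1, 4: 9}, source=1)
```

The first vector is the true set of shortest path lengths from node 1; the second ignores the shorter route to node 2 through node 3 and therefore violates the equation for $j=2$.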
I.4 Shortest Path Trees

A digraph containing no cycles is called a directed acyclic graph
or dag for short. A rooted tree, $T=(M,B)$, is a dag satisfying

1) There is exactly one vertex called the root with indegree = 0.

2) Every other vertex has indegree = 1.

3) There is a path from the root to every other vertex.

Note that 2) implies that the path in 3) is unique, and one can also
show that $|B| = |M| - 1$. Because $T$ is acyclic, there is at least one
vertex having outdegree = 0, and any such vertex is a leaf. The depth of
a vertex $j$ in $T$ is defined as the number of arcs in the path from the
root to $j$. The depth of the root is 0, and the depth of the tree is
defined as $\max\{\mathrm{depth}(j) \mid j \in M\}$. Examples of trees
and dags are shown below.


Now suppose $G=(N,A)$ is a weighted digraph in which every cycle has
positive length. Define $DO(i)$ to be the set of arcs such that each arc
is in a shortest path from $i$ to some other vertex. Let $DSPO(i)$ be the
graph $(N,DO(i))$. By using the principle of optimality and the fact that
all cycles in $G$ have positive length, one can show that the graph
$DSPO(i)$ is a dag. ($DSPO(i)$ stands for dag of shortest paths out of
$i$.) $DSPO(i)$ is not always a tree because there can be two or more
different paths from $i$ to some other vertex that have equal lengths.

[Figure: example of DSPO(i)]


By arbitrarily choosing only one of the paths (in the example, delete
either $(l,j)$ or $(m,j)$) we can change the graph into a tree.


If we do this for all vertices in $DSPO(i)$ having indegree > 1, we can
construct a tree rooted at $i$ in which the path from $i$ to $j$ in the
tree is a shortest path from $i$ to $j$ in $G$. We call this a tree of
shortest paths out of $i$, $TSPO(i)$. By considering the set of arcs in
$G$ that are in shortest paths from all other vertices to a vertex $i$,
one obtains analogous structures: shortest path trees and dags into $i$.
We call these $DSPI(i)$ and $TSPI(i)$. ($TSPI(i)$ is a graph satisfying
the definition of rooted tree with the roles of indegree and outdegree
reversed. Thus it is a reverse rooted tree, but we still call it a tree.)
If $\ell(i,j) = 1$ $\forall (i,j) \in A$ then a shortest path is called
a minimum hop path for obvious reasons. In this special case, we call
these trees and dags MHT(D)I(O)(i) for minimum hop tree (dag) into
(out of) $i$. These tree and dag structures play an important role in
shortest path algorithms.
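The construction of $DSPO(i)$ and $TSPO(i)$ can be sketched in Python as follows, assuming the shortest distances from the source are already known, every node is reachable, and all arclengths are positive. The names `dspo_arcs` and `tspo_parents`, and the tie-breaking rule (keep the smallest-numbered predecessor), are assumptions of this sketch; the text only requires that the choice be arbitrary.

```python
def dspo_arcs(arcs, D):
    """Arcs (k, j) lying on some shortest path out of the source.

    An arc qualifies exactly when D[k] + l(k, j) == D[j].
    """
    return sorted((k, j) for (k, j), w in arcs.items() if D[k] + w == D[j])

def tspo_parents(arcs, D):
    """Keep one incoming DSPO arc per vertex, giving a tree as parent pointers."""
    parent = {}
    for k, j in dspo_arcs(arcs, D):
        parent.setdefault(j, k)   # arbitrary choice: first (smallest) predecessor
    return parent

# Hypothetical example: node 4 has two equal-length shortest paths from node 1.
example_arcs = {(1, 2): 1, (1, 3): 1, (2, 4): 1, (3, 4): 1, (1, 4): 5}
example_D = {1: 0, 2: 1, 3: 1, 4: 2}
example_dag = dspo_arcs(example_arcs, example_D)
example_tree = tspo_parents(example_arcs, example_D)
```

In the example, vertex 4 has indegree 2 in the dag, so one of the arcs $(2,4)$ or $(3,4)$ is dropped to obtain a tree.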


! 


d 
II. CENTRALIZED ALGORITHMS FOR THE SHORTEST PATH PROBLEM

II.1 Single Source Problem

It  is  perhaps  a curious  fact  that  there  is  no  known  algorithm  for 
solving  the  SP  problem  that  in  effect  doesn't  solve  the  SS  problem  at 
the  same  time.  In  this  section,  three  algorithms  for  solving  the  SS 
problem  are  presented. 

Dijkstra's  Algorithm 

This algorithm works under the assumption that $\ell(i,j) > 0$
$\forall (i,j) \in A$. We take node 1 as the source.

$D(1,1) \leftarrow 0$
$D(1,j) \leftarrow \ell(1,j)$
$P \leftarrow \{1\}$
$T \leftarrow \{2, \ldots, n\}$

Step 1: Designation of the set P of Permanent Labels

Find $k \in T$ s.t. $D(1,k) = \min[D(1,j) \mid j \in T]$
$T \leftarrow T - \{k\}$
$P \leftarrow P \cup \{k\}$
If $T = \emptyset$ stop. Else Go To Step 2.

Step 2: Revision of Tentative Distances to Nodes in T

$\forall j \in T \cap AL(k)$ do: $D(1,j) \leftarrow \min\{D(1,j),\ D(1,k) + \ell(k,j)\}$
Go To Step 1.
The $m$-th time step 1 is executed, at most $n-m-1$ comparisons are
needed to find the minimum. Thus step 1 requires a total of $O(n^2)$
comparisons. Each arc in $A$ appears at most once as part of an addition.
Thus the total cost is $O(n^2 + |A|)$: $O(n^2)$ comparisons and $O(|A|)$
additions.
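The two steps above can be sketched directly in Python. This is a minimal illustration, assuming nodes are numbered 1 to n and the arclengths are supplied as a dict `arcs` mapping `(i, j)` to a positive number; the function name `dijkstra` and the example graph are not taken from the thesis.

```python
import math

INF = math.inf

def dijkstra(n, arcs, source=1):
    """Array-based Dijkstra: D[j] ends as the shortest distance source -> j."""
    # Initialization: tentative distances are the direct arclengths.
    D = {j: arcs.get((source, j), INF) for j in range(1, n + 1)}
    D[source] = 0
    T = set(range(1, n + 1)) - {source}   # tentatively labelled nodes
    while T:
        # Step 1: the closest tentative node becomes permanent.
        k = min(T, key=lambda j: D[j])
        T.remove(k)
        # Step 2: revise tentative distances of the neighbors of k still in T.
        for j in T:
            w = arcs.get((k, j))
            if w is not None:
                D[j] = min(D[j], D[k] + w)
    return D

# Hypothetical example graph.
example_arcs = {(1, 2): 4, (1, 3): 1, (3, 2): 2, (2, 4): 5}
example_D = dijkstra(4, example_arcs)
```

The linear scan for the minimum in Step 1 is what gives the $O(n^2)$ comparison count discussed above.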

Proof of Correctness

Claim: Let $T_m = \{j_1, \ldots, j_{n-m}\}$ and $P_m$ be the sets $T$ and
$P$ at the beginning of the $m$-th iteration. Let $D_m(1,j_k)$ be the
value of $D(1,j_k)$ at that time, $1 \le k \le n-m$. Assume (without loss
of generality) that node $j_1$ is the node in $T_m$ chosen in the first
line of step 1. Then

1) $D_m(1,j_k)$ is the length of a shortest path from 1 to $j_k$ s.t. all
nodes except $j_k$ are in $P_m$; for all $1 \le k \le n-m$.

2) $D_m(1,j_1)$ is in fact a shortest path length (unconstrained).

Proof:

Part 1) follows immediately from the fact that at each iteration,
$D(1,j)$ is updated with the smaller of the old value of $D(1,j)$ and the
path length $D(1,k) + \ell(k,j)$ where $k$ was just marked permanent.

To prove part 2), assume that there is some other path $\hat{P}[1,j_1]$
of length $\hat{D}(1,j_1)$ s.t. $\hat{D}(1,j_1) < D_m(1,j_1)$. Let node
$x$ be the first node in $\hat{P}[1,j_1]$ that is in $T_m$ (note
$j_1 \in T_m \Rightarrow x$ exists, and node $1 \in P_m \Rightarrow
x \neq 1$). Let $\hat{P}[1,x]$ be the subpath of $\hat{P}[1,j_1]$ that
goes from 1 to $x$ and let the length of $\hat{P}[1,x]$ be
$\hat{D}(1,x)$. Then we have

$D_m(1,x) \le \hat{D}(1,x)$ by part 1 of the claim, since all nodes in
$\hat{P}[1,x]$ except $x$ are in $P_m$

$\le \hat{D}(1,j_1)$ since all arclengths are $\ge 0$

$< D_m(1,j_1)$ by assumption.

Hence we have $D_m(1,x) < D_m(1,j_1)$. If $x = j_1$, this is a
contradiction. If $x \neq j_1$, this means $j_1$ would not have been
chosen in step 1. Again a contradiction. $\blacksquare$

Thus Dijkstra's algorithm works by growing a shortest path tree.
It finds such a tree for nodes $i_1, \ldots, i_m$, say, such that
$D(1,i_1) \le D(1,i_2) \le \ldots \le D(1,i_m)$ and shortest paths to the
remaining nodes have lengths $\ge D(1,i_m)$. It then considers one arc
extensions of this tree and labels the closest unlabelled node. (See
example below):

[Figure: one arc extensions of the tree of labelled nodes]


Observe, however, that it considers all one arc extensions from a certain
node as soon as it is labelled. The set of tentative distances, therefore,
need not be in any order, and so it can take $n-m-1$ comparisons to find
the minimum in the worst case at the $m$-th stage. By considering
shortest one arc extensions only one at a time, we can achieve an
$O(|A| \log n)$ comparisons algorithm. This is better than $O(n^2)$ if
$|A| \ll n^2$, i.e. sparse graphs. This modification of Dijkstra's
algorithm is due to Spira [5].

To  explain  this  procedure,  we  introduce  the  notion  of  played 
binary  trees.  An  undirected  graph  T is  a binary  tree  if  it  is  the 
undirected  graph  associated  with  a rooted  tree  T having  the  property 
that  all  vertices  in  T,  other  than  the  leaves,  have  outdegree  = 2.  If 
all  leaves  are  at  the  same  depth  the  binary  tree  is  complete. 


[Figure: examples of incomplete and complete binary trees]

Notice that a binary tree with $n$ leaves must have at least depth
$\lceil \log_2 n \rceil$.

Binary trees are useful for extracting minima of sets. To find
the minimum of a set of $n$ numbers, first construct a complete binary
tree with $2^{\lceil \log_2 n \rceil}$ leaves. Place the numbers on
various leaves and perform successive comparisons going up the tree.


This  is  called  playing  the  tree  and  n-1  comparisons  are  required  to 
find  the  minimum.  After  doing  this,  we  can  find  the  next  smallest 
element  in  0(log  n)  comparisons.  We  erase  the  path  taken  by  the  winner 
and  then  replay  that  part  of  the  tree.  Since  it  has  depth  0(log  n) , 
only  0(log  n)  comparisons  are  required. 




We  obtain  a savings  over  the  naive  method  (performing  n-2  comparisons 
on  the  n-1  remaining  elements)  because  we  only  have  to  redo  comparisons 
among  elements  that  lost  to  the  minimum  the  first  time.  This  leads  us 
to  the  following  lemma. 


Lemma: There is an algorithm to find the $k$ smallest elements of a set
of $n$ elements in $n - 1 + (k-1)\lceil \log n \rceil$ comparisons.
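Playing and replaying the tree can be sketched as a tournament tree stored in an array. This is an illustrative implementation under stated assumptions: the array layout, the padding of unused leaves with infinity, and the name `k_smallest` are choices made here, and because of the padding the first play may use slightly more than $n-1$ comparisons. It assumes $k \le n$.

```python
import math

def k_smallest(values, k):
    """Return the k smallest values (k <= len(values)) by replaying a tournament tree."""
    n = len(values)
    size = 1
    while size < n:            # number of leaves: smallest power of two >= n
        size *= 2
    # tree[1] is the root; leaves are tree[size:2*size].
    # Each entry is (value, index of the leaf the value came from).
    tree = [(math.inf, -1)] * (2 * size)
    for idx, v in enumerate(values):
        tree[size + idx] = (v, size + idx)
    # Play the tree: each internal node holds the winner (minimum) of its children.
    for node in range(size - 1, 0, -1):
        tree[node] = min(tree[2 * node], tree[2 * node + 1])
    result = []
    for _ in range(k):
        value, leaf = tree[1]
        result.append(value)
        # Erase the winner and replay only along its path to the root.
        tree[leaf] = (math.inf, leaf)
        node = leaf // 2
        while node >= 1:
            tree[node] = min(tree[2 * node], tree[2 * node + 1])
            node //= 2
    return result

example = k_smallest([7, 3, 9, 1, 4], 3)
```

Each replay touches one node per level, which is the source of the $\lceil \log n \rceil$ term in the lemma.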


Spira's  algorithm  uses  a variation  of  this  idea. 


Lemma: Let $\{S_i\}$, $1 \le i \le k$, be a family of sets such that

1) $|S_1| = 1$

2) $S_i = S_{i-1} - \{\min S_{i-1}\}$ + one or two new elements.

Then there is an algorithm to successively find the minimum of
$S_1, S_2, \ldots, S_k$ in $2k \lceil \log n \rceil$ comparisons provided
$|S_i| \le n$.
We only sketch the proof. Construct a complete binary tree with
$2^{\lceil \log n \rceil}$ leaves. Place the element in $S_1$ at a leaf
and play the tree ($\lceil \log n \rceil$ steps). Erase the path taken by
this element and place one new element in $S_2$ at the leaf where the
element of $S_1$ was. If there is another new element in $S_2$, place it
at some other leaf. Replay the tree. The procedure continues in the same
manner. $\blacksquare$

[Figure: example of successive sets $S_i$ played on a binary tree]


Now we present the algorithm. We assume that we are given an
arclength list for each node $i$, and we call this list ALLST(i). Node
1 is taken to be the source.


Dijkstra-Spira Algorithm

1. For $i=1$ to $n$ DO: SORT ALLST(i) into increasing order.

2. For $j=2$ to $n$ DO: LABEL(j) = TENTATIVE

3. $T \leftarrow \{2, \ldots, n\}$

4. $D(1,1) \leftarrow 0$

5. $S_1 \leftarrow \{D(1,1) + \min \mathrm{ALLST}(1)\}$.

COMMENT: The sets $S_i$ will consist of distances of various paths from
node 1 to other nodes. Each such distance will be stored in the form
$D(1,p) + \ell(p,k)$ since we will need the identity of the node $p$
which is the next to last vertex in the path from 1 to $k$ having
distance $D(1,p) + \ell(p,k)$. Though we don't explicitly give the data
structure for doing this, we observe that storing a distance in this
two-part manner only adds a constant amount of work over just storing the
distance as one number.

6. $i \leftarrow 1$

7. Let $D(1,p) + \ell(p,k) = \min\{D(1,p') + \ell(p',k') \mid (D(1,p') + \ell(p',k')) \in S_i\}$

8. $\mathrm{ALLST}(p) \leftarrow \mathrm{ALLST}(p) - \{\min \mathrm{ALLST}(p)\}$.

COMMENT ON 8: The minimum of ALLST(p) at this point is just $\ell(p,k)$
and since we examined it in 7, we remove it.

9. $S_{i+1} \leftarrow S_i - \{D(1,p) + \ell(p,k)\} \cup \{D(1,p) + \min \mathrm{ALLST}(p)\}$

COMMENT ON 9: We revise our set of distances by deleting the one just
examined in 7 and adding the distance of the next best path that is a
one arc extension from $p$.

10. IF LABEL(k) = TENTATIVE THEN
DO;
LABEL(k) = PERMANENT
$T \leftarrow T - \{k\}$
$D(1,k) \leftarrow D(1,p) + \ell(p,k)$
$S_{i+1} \leftarrow S_{i+1} \cup \{D(1,k) + \min \mathrm{ALLST}(k)\}$
END;

COMMENT ON 10: If the distance in 7 was to a tentative node, then it
is a shortest distance. We mark the node as permanent and add to the
set of candidate distances the distance of the next best path that is
a one arc extension from $k$.

11. $i \leftarrow i+1$

12. If $T \neq \emptyset$ GO TO 7

13. ELSE STOP.

The  proof  of  correctness  of  this  procedure  should  be  clear  from 
previous  discussions.  It  is  basically  Dijkstra's  algorithm  except 
extensions  of  paths  from  labelled  nodes  are  considered  one  at  a time. 
This  allows  the  minimum  to  be  found  in  an  efficient  manner  in  step  7. 


Sorting a list of $q$ numbers can be done in $c \cdot q \log q$
comparisons for some constant $c$. Thus, the arclength list ALLST(i) can
be sorted in $c \cdot OD(i) \log(OD(i))$ comparisons, so step 1 requires

$$c \sum_{i=1}^{n} OD(i) \log(OD(i))$$

comparisons. Now $OD(i) \le n$ and so step 1 requires at most

$$c \cdot \log n \sum_{i=1}^{n} OD(i) = c \cdot |A| \cdot \log n$$

comparisons. Step 7 requires $O(\log n)$ comparisons to find the minimum
because of the lemma, and it is executed at most once for each arc in
$A$. Thus the overall cost is $O(|A| \log n)$.
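A rough Python sketch of the Dijkstra-Spira procedure follows. It is not the data structure described in the text: a binary heap from the standard `heapq` module is swapped in for the played binary tree (both give $O(\log n)$ extraction of the minimum), and a cursor into each sorted list stands in for deleting the minimum of ALLST(p). The function name, the helper `push_next`, the example graph, and the extra guard for an empty candidate set (for nodes unreachable from the source) are all assumptions of this sketch.

```python
import heapq
import math

def dijkstra_spira(n, arcs, source=1):
    """Dijkstra with one-arc extensions considered one at a time."""
    # Step 1: sort each arclength list; entries are (length, head node).
    allst = {i: sorted((w, j) for (a, j), w in arcs.items() if a == i)
             for i in range(1, n + 1)}
    cursor = {i: 0 for i in range(1, n + 1)}   # next unexamined arc of each node
    D = {j: math.inf for j in range(1, n + 1)}
    D[source] = 0
    tentative = set(range(1, n + 1)) - {source}
    S = []   # candidate set: entries (D(1,p) + l(p,k), p, k)

    def push_next(p):
        """Add the next best one-arc extension from p to the candidate set."""
        if cursor[p] < len(allst[p]):
            w, k = allst[p][cursor[p]]
            cursor[p] += 1
            heapq.heappush(S, (D[p] + w, p, k))

    push_next(source)                      # step 5
    while tentative and S:                 # step 12, plus guard for empty S
        dist, p, k = heapq.heappop(S)      # step 7
        push_next(p)                       # steps 8 and 9
        if k in tentative:                 # step 10
            tentative.remove(k)
            D[k] = dist
            push_next(k)
    return D

example_arcs = {(1, 2): 4, (1, 3): 1, (3, 2): 2, (2, 4): 5}
example_D = dijkstra_spira(4, example_arcs)
```

Each arc enters the candidate set at most once, so the loop body runs at most $|A|$ times, matching the count used in the cost analysis above.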

Bellman-Ford  Method 

If there are negative arclengths then the inductive character
of Dijkstra's algorithm breaks down since a path can have a smaller
length than its subpaths. However, as long as there are no negative
length cycles, there will exist cycle free shortest paths. The following
procedure solves the SS problem when negative arclengths are allowed
provided there are no negative length cycles. It is $O(n^3)$, however,
since in effect it considers extensions of paths from all nodes rather
than extensions of known optimal paths. Node 1 is again taken as the
source.

$D(1,1,1) = 0$
$D(1,j,1) = \ell(1,j)$ (Recall $\ell(1,j) = \infty$ if $j \notin AL(1)$)
$m \leftarrow 1$

Step 1: $D(1,j,m+1) \leftarrow \min[D(1,j,m),\ \min_{k \neq j}\{D(1,k,m) + \ell(k,j)\}]$

Step 2: IF $D(1,j,m+1) = D(1,j,m)$ $\forall j$ then STOP.
ELSE DO: $m \leftarrow m+1$, Go to Step 1.

An intuitive proof of correctness can be seen in the following
interpretation of $D(1,j,m)$:

$D(1,j,m)$ = the length of a shortest path from 1 to $j$ that
uses $m$ or fewer arcs.

(The reader can easily verify this interpretation.) Since there is a
shortest path using $n-1$ or fewer arcs, $D(1,j,n-1)$ must be a shortest
distance, $\forall j$. If the test in step 2 yields a true result for
$m < n-1$, then we are fortunate to have early convergence. To analyze
the cost, observe that each execution of step 1 requires $O(n^2)$
comparisons and additions. Since step 1 can be executed $O(n)$ times in
the worst case, the algorithm requires $O(n^3)$ comparisons and
additions.

We observe that the algorithm can be modified so that in step 1,
only the additions $D(1,k,m) + \ell(k,j)$ for $(k,j) \in A$ are performed.
In this case, each execution of step 1 requires a total of $O(|A|)$
additions and comparisons and overall the algorithm is $O(|A| n)$. This
of course can be $O(n^3)$.
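The modified method, in which step 1 only examines arcs actually present in $A$, might be sketched as follows. The dict-based representation and the name `bellman_ford` are assumptions for illustration; the sketch presumes there are no negative length cycles.

```python
import math

def bellman_ford(n, arcs, source=1):
    """Shortest distances from source; negative arclengths allowed, no negative cycles."""
    # D holds D(1, j, m): shortest length using m or fewer arcs (m = 1 initially).
    D = {j: arcs.get((source, j), math.inf) for j in range(1, n + 1)}
    D[source] = 0
    for _ in range(1, n):
        new = dict(D)
        # Step 1, restricted to arcs (k, j) in A: O(|A|) additions and comparisons.
        for (k, j), w in arcs.items():
            if D[k] + w < new[j]:
                new[j] = D[k] + w
        # Step 2: stop early if no distance changed.
        if new == D:
            break
        D = new
    return D

# Hypothetical example with one negative arc and no negative cycle.
example_arcs = {(1, 2): 4, (1, 3): 5, (3, 2): -3, (2, 4): 2}
example_D = bellman_ford(4, example_arcs)
```

In the example the direct arc to node 2 has length 4, but the two-arc path through node 3 has length 2, which in turn shortens the path to node 4.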


II.2 All Pairs Problem


Let $P$ and $Q$ be two real $n \times n$ matrices. Define the min/plus
product $*$ of $P$ and $Q$, $R = P * Q$, by

$$R_{ij} = \min_k \left( P_{ik} + Q_{kj} \right).$$

If $L$ is a matrix of arclengths $\ell(i,j)$ then a little work shows that

$$L^m = \underbrace{L * (L * ( \cdots * L) \cdots )}_{m \text{ times}}$$

is a matrix of shortest path lengths subject to the constraint that
the paths have $m$ or fewer arcs. One can show that $*$ is associative,
and therefore $L^{n-1}$, a matrix of shortest path lengths, can be
computed by performing $\lceil \log_2(n-1) \rceil$ successive squaring
operations. Each squaring can be done in $O(n^3)$ additions and
comparisons and so we obtain an $O(n^3 \log n)$ algorithm.
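The min/plus product and the repeated squaring scheme can be sketched in a few lines of Python. Matrices are plain lists of lists with 0-based indices, missing arcs are represented by infinity, and the diagonal is zero as in the definition of $\ell$; the names `min_plus` and `all_pairs_by_squaring` are assumptions of this sketch.

```python
import math

INF = math.inf

def min_plus(P, Q):
    """Min/plus product: R[i][j] = min over k of P[i][k] + Q[k][j]."""
    n = len(P)
    return [[min(P[i][k] + Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def all_pairs_by_squaring(L):
    """Square L until paths of up to n-1 arcs are covered."""
    n = len(L)
    M = L
    arcs_allowed = 1           # M currently covers paths with this many arcs or fewer
    while arcs_allowed < n - 1:
        M = min_plus(M, M)
        arcs_allowed *= 2
    return M

# Hypothetical 4-node example (node numbers shifted to 0..3).
example_L = [
    [0,   4,   1,   INF],
    [INF, 0,   INF, 5],
    [INF, 2,   0,   INF],
    [INF, INF, INF, 0],
]
example_D = all_pairs_by_squaring(example_L)
```

Because the diagonal entries are zero, squaring past $n-1$ arcs does no harm when there are no negative length cycles, so the exponent need only reach at least $n-1$.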

There is in fact a clever labelling algorithm due to Floyd and
Warshall that solves the AP problem in $O(n^3)$ comparisons and additions.

Let $D(i,j,m)$ = the length of a shortest path from $i$ to $j$ subject to
the constraint that all nodes other than $i$ and $j$ are not in the set
$\{m, m+1, \ldots, n\}$. Then $D(i,j,n+1)$ is a shortest distance between
$i$ and $j$. (Again we assume no negative length cycles.) Now a shortest
path from $i$ to $j$ that has no intermediate nodes in the set
$\{m+1, \ldots, n\}$ either a) does not visit $m$, in which case
$D(i,j,m+1) = D(i,j,m)$, or b) visits $m$, in which case
$D(i,j,m+1) = D(i,m,m) + D(m,j,m)$. Hence we can compute
$D(i,j,n+1)$ as follows.

1. $D(i,j,1) \leftarrow \ell(i,j)$

2. $m \leftarrow 1$

3. $D(i,j,m+1) \leftarrow \min(D(i,j,m),\ D(i,m,m) + D(m,j,m))$

4. $m \leftarrow m+1$

5. IF $m = n+1$ STOP. ELSE GO TO 3.

Note, negative arclengths are allowed as long as there are no negative
length cycles. There are exactly $n(n-1)(n-2)$ equations to solve,
$i \neq j$ and $m \neq i$ and $m \neq j$, and hence the algorithm is
$O(n^3)$. This is the same order of complexity as the Bellman-Ford
algorithm, yet here we find all $n(n-1)$ shortest paths. The AP problem
has an algebraic flavor (matrices) that is not present in the SS problem.
Observe, though, that if $\ell(i,j) > 0$ $\forall i \neq j$ then we can
also achieve an $O(n^3)$ algorithm for the AP problem by applying
Dijkstra's algorithm $n$ times, once for each node as a source.
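The Floyd-Warshall recurrence can be sketched as below. The sketch overwrites a single matrix in place instead of keeping a separate $D(\cdot,\cdot,m)$ for every $m$, which is a common space-saving variant and an assumption of this example, as are the function name and the 0-based matrix representation. It presumes there are no negative length cycles.

```python
import math

INF = math.inf

def floyd_warshall(L):
    """All-pairs shortest distances from an arclength matrix L (zero diagonal)."""
    n = len(L)
    D = [row[:] for row in L]          # D(i, j, 1) = l(i, j)
    for m in range(n):                 # allow node m as an intermediate node
        for i in range(n):
            for j in range(n):
                through_m = D[i][m] + D[m][j]
                if through_m < D[i][j]:
                    D[i][j] = through_m
    return D

# Hypothetical example with a negative arc (node numbers shifted to 0..3).
example_L = [
    [0,   4,   5,   INF],
    [INF, 0,   INF, 2],
    [INF, -3,  0,   INF],
    [INF, INF, INF, 0],
]
example_D = floyd_warshall(example_L)
```

The in-place update is safe because, without negative length cycles, row $m$ and column $m$ do not change during the round in which $m$ is the intermediate node.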


II.3 Lower Bounds

An analytic tree program is a program that at each step evaluates
some analytic function of the input and then branches to one of two
successive steps. Dijkstra's algorithm is such a procedure since at
each step it computes a linear function of the input distances (addition)
and then branches based on a comparison. Spira and Pan [6] have shown
that any analytic tree program which verifies that a weighted rooted
tree with $n$ vertices is a shortest path tree (with respect to some
weighted graph containing this tree) must use on the order of $n^2$
comparisons in the worst case. Thus Dijkstra's algorithm is essentially
optimal in the class of algorithms that use comparisons among sums of
arclengths to compute a shortest path tree. An interesting open problem
is to find an $O(|A|)$ algorithm for the SS problem when the graph is
sparse, i.e. $|A| \ll n^2$.

It has also been shown that computing the matrix product $*$ is
of the same order of complexity as solving the AP problem, when the
matrix entries and arclengths are nonnegative. That is, computing
$P * Q$ is of the same difficulty as computing $L^{n-1}$ for nonnegative
reals. The reader is referred to [3] for a proof, references to the
original work, and a more general formulation of the minimum cost
path problem and its relationship to computing $*$. The problem of
detecting negative cycles is discussed in [4]. Kerr [7] has shown
that if the only permissible operations are $p+q$ and $\min(p,q)$ then
on the order of $n^3$ operations are required to compute $P * Q$ in the
worst case.


III. DISTRIBUTED SHORTEST PATH ALGORITHMS

III.1 More Notation and Assumptions

Since $G$ represents a data network, we hereafter assume (unless
otherwise noted)

1) $(i,j) \in A$ implies $(j,i) \in A$. (Links are duplex.)

2) $G$ is connected.

3) $\ell(i,j) > 0$ $\forall (i,j) \in A$. (Conventional routing
algorithms generally assign positive costs to the links.)

A directed graph satisfying condition 1 is termed symmetric. However,
it is not assumed that $\ell(j,i) = \ell(i,j)$, because in general the
cost of sending a certain flow over the path $[i,j]$ will not be the same
as sending an equal amount of flow over the path $[j,i]$. Note that 1)
and 2) imply that the directed graph $G$ is strongly connected.

An undirected graph $\bar{G}=(N,L)$ is called bipartite if $\exists N_1$
and $N_2$ such that $N_1 \cap N_2 = \emptyset$, $N_1 \cup N_2 = N$ and
$\langle i,j \rangle \in L$ implies either $i \in N_1$, $j \in N_2$, or
$i \in N_2$, $j \in N_1$. Intuitively, a bipartite graph is one that can
be made to look like this

[Figure: a bipartite graph with node sets $N_1$ and $N_2$]

for some appropriate partition of the set $N$ into $N_1$ and $N_2$. Our
interest in bipartite graphs will become apparent in III.2.

We assume that at each node of the network, there is a computer
capable of performing the usual operations such as addition, subtraction,
comparison, branching, and data storage and retrieval.
Additionally, we assume that there is a protocol enabling reliable
node to node communication. The communication cost of an algorithm
will be taken as

$$\sum_{\substack{\text{all messages occurring} \\ \text{while algorithm executes}}} (\text{message cost in bits}) \times (\text{\# arcs traversed by message})$$

In  general,  messages  will  consist  of  node  identities  and  arclengths. 

An element of the set of integers {1,...,M} can be encoded into a
binary number with ⌈log2(M)⌉ bits. Note that even if we wish only
to encode the number 1 as an element of this set, ⌈log2(M)⌉ bits are
still required since we must specify the end of this message. Thus a
node identity can be encoded with ⌈log2 n⌉ bits. If arclengths are all
integers, then one arclength can be encoded with ⌈log2(ℓ-max)⌉ bits
where ℓ-max = max{ℓ(i,j) | (i,j) ∈ A}. If arclengths are allowed to be
arbitrarily large integers or rational numbers with arbitrary precision,
then the cost of representing them is clearly unbounded.

However, this is misleading since in some sense, we wish to consider
the transmission of an arclength from some node to one of its neighbors
as 1 message. In conventional networks, the arclengths tend to be
relatively small integers anyway. The communication cost of a distributed
algorithm X will be denoted by CC(X).
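
As a rough illustration of this cost measure, the following Python sketch charges each message its length in bits times the number of arcs it traverses. The names `bits_for` and `communication_cost`, and the representation of a message as a `(bits, arcs)` pair, are assumptions made for this example and do not appear in the text.

```python
def bits_for(M):
    """Bits needed to encode one element of {1, ..., M}: ceil(log2 M)."""
    return (M - 1).bit_length()


def communication_cost(messages):
    """Sum over all messages of (message cost in bits) x (# arcs traversed).

    `messages` is an iterable of (bits, arcs) pairs (a hypothetical format).
    """
    return sum(bits * arcs for bits, arcs in messages)


# Hypothetical example: n = 16 nodes, so a node identity costs 4 bits.
n = 16
id_bits = bits_for(n)
# Three identities: two cross a single arc, one is relayed over 3 arcs.
cost = communication_cost([(id_bits, 1), (id_bits, 1), (id_bits, 3)])
```

The integer `bit_length` call is used instead of a floating point logarithm so that exact powers of two are not affected by rounding.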

The algorithms presented in the remainder of the thesis will
require that the nodes process and communicate various arclengths and
path distances. For notational convenience, we will denote arclengths and
path distances simply by ℓ(i,j) and d(i,j) when presenting the algorithms.
However, we will assume that whenever a node stores an arclength ℓ(i,j),
it does so in a manner that allows it to retrieve the identities i and j
at some later time. (This can be done by either directly storing the
identities i and j together with the length or using a data structure that
allows i and j to be uniquely determined from the position of the arclength
in the data structure, e.g. mapping the pair (i,j) into some index in a
1-1 manner.) In particular, a node will be able to supply the identities
i and j as well as the length if some other node requests this information.
The same will hold for distances d(i,j).


We first consider the problem of finding minimum hop paths since
it illustrates the salient features of the operation and analysis of
distributed algorithms. Each node i initially knows only its own
identity, and at the completion of the algorithm knows all neighbors
through which it has minimum hop paths to k, for each k≠i. It
doesn't know the entire path, but then this is not necessary. With
all distributed algorithms, there is the problem of beginning the
procedure. That is, how do the nodes know when to start? For the time
being, we ignore this problem, and assume that, somehow, the nodes
receive  a signal  to  start.  Later,  we  will  deal  with  the  problem  of 
executing these algorithms in an asynchronous environment.

Algorithm MH1: (Gallager)

Each node i executes

Step 0: Broadcast the identity "i" to all neighbors and receive
transmissions from them.

Step ℓ, ℓ>0:

1) Record newly discovered identities and the neighbors
from which they were received.

2) If no new identities were received, broadcast the
message "done" and stop.

3) Otherwise, to each neighbor j, broadcast the identities
of all nodes not previously received from
or broadcast to j. If there are no such identities
to broadcast to j, send the message "nothing new".

4) Receive transmissions from all active neighbors.

One can see that i has an (ℓ+1) hop minimum hop path to a node k
through neighbor j iff

1) i first heard about k at step ℓ and

2) i heard about k from j at step ℓ.
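
The steps of MH1 can be sketched as a lock-step simulation in Python. This is only an illustration under stated assumptions: all nodes are assumed to start together, the "done" and "nothing new" control messages are not modelled, and the names `mh1`, `first_step`, `via`, and `exchanged` are hypothetical. A node k is `first_step[i][k] + 1` hops from i, and `via[i][k]` holds the neighbors through which i has a minimum hop path to k.

```python
def mh1(adj):
    """Simulate MH1 in synchronous steps.

    adj maps each node to the set of its neighbors (duplex links).
    Returns (first_step, via, ids_sent).
    """
    first_step = {i: {} for i in adj}   # step at which i first heard of k
    via = {i: {} for i in adj}          # neighbors that reported k at that step
    # exchanged[i][j]: identities already received from or broadcast to j
    exchanged = {i: {j: set() for j in adj[i]} for i in adj}
    # Step 0: every node broadcasts its own identity.
    outbox = {i: {j: {i} for j in adj[i]} for i in adj}
    ids_sent = 0
    step = 0
    while any(ids for box in outbox.values() for ids in box.values()):
        new = {i: set() for i in adj}
        for i, box in outbox.items():
            for j, ids in box.items():
                ids_sent += len(ids)
                exchanged[i][j] |= ids   # sender's record
                exchanged[j][i] |= ids   # receiver's record
                for k in ids:
                    if k not in first_step[j]:
                        first_step[j][k] = step
                        via[j][k] = set()
                        new[j].add(k)
                    if first_step[j][k] == step:
                        via[j][k].add(i)
        step += 1
        # Step l > 0: forward only identities that are new and that have
        # not already crossed the link to j in either direction.
        outbox = {i: {j: new[i] - exchanged[i][j] for j in adj[i]}
                  for i in adj}
    return first_step, via, ids_sent


path = {0: {1}, 1: {0, 2}, 2: {1}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
```

On the three node path every identity crosses every link once, while on the triangle (an odd cycle) each identity crosses the link opposite its node in both directions.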


Communication Cost:

Each node identity traverses each of the L duplex links in at
least one direction. Since each identity is encoded into log n bits,
we have

CC(MH1) ≥ Ln log n bits.

An identity k will traverse a link <i,j> in both directions iff i
and j are connected to k via equidistant minimum hop paths.


[Figure: graph showing the path followed by k's identity.]


In the above graph i (j) will hear about k from p (q) at the same step and
then tell the other about k at the next step. If there exist such i, k, j,
then in fact there is an s such that i, j, and s are part of an odd
elementary cycle in G. (A cycle [i_1, ..., i_m, i_1] in G is termed elementary
if there is no other cycle in G containing exactly a proper subset of
the nodes [i_1, ..., i_m].) One can show that G has no odd length elementary
cycles iff G has no odd length cycles iff G is bipartite. This corresponds
to the fact that in a bipartite graph, for every link <i,j> and node k,
either i has a minimum hop path to k through j or j has a minimum hop
path to k through i. Note that in the example, neither i nor j has a
minimum hop path to k through the other. Of course the graph is not
bipartite.

To upperbound CC(MH1), observe that node i transmits its identity
to D(i) neighbors and the identity of node j≠i to at most D(i)-1
neighbors, where D(i) = degree of node i. Thus we have

$$CC(MH1) \le \Big[ \sum_i D(i) + \sum_i \sum_{j \ne i} (D(i)-1) \Big] \log n \ \text{bits} = [2Ln - n(n-1)] \log n \ \text{bits}.$$
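
The arithmetic behind these two bounds can be checked numerically. The sketch below is an illustration only; the function name `mh1_bounds` and the adjacency dictionary format are assumptions. It counts identities, so both values must be multiplied by log n to obtain bits.

```python
def mh1_bounds(adj):
    """Lower and upper bounds on the identities transmitted by MH1.

    adj maps each node to the set of its neighbors (duplex links).
    """
    n = len(adj)
    degree = {i: len(adj[i]) for i in adj}
    L = sum(degree.values()) // 2          # each duplex link is counted twice
    lower = L * n
    # sum_i D(i) + sum_i sum_{j != i} (D(i) - 1)
    upper = (sum(degree.values())
             + sum((n - 1) * (d - 1) for d in degree.values()))
    assert upper == 2 * L * n - n * (n - 1)  # closed form from the text
    return lower, upper


triangle_bounds = mh1_bounds({0: {1, 2}, 1: {0, 2}, 2: {0, 1}})
path_bounds = mh1_bounds({0: {1}, 1: {0, 2}, 2: {1}})
```

For a tree, where L = n-1, the two bounds coincide, as the three node path shows.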

There is a simple way to get around the odd cycle problem (at the
expense of doubling the number of steps taken by MH1) and achieve an
algorithm that uses Ln log n bits for all networks with n nodes and L
duplex links. Each pair i,j s.t. <i,j> ∈ L arbitrarily chooses one of
the nodes to be HI and the other LO. Then MH1 is executed with HI and
LO nodes broadcasting on alternate steps.




Algorithm MH2:

Step 0: Node pairs exchange identities and choose HI and LO.

Step ℓ, ℓ odd: All HI nodes broadcast to their corresponding LO neighbors,
as in MH1, new identities learned at Step ℓ-1.

Step ℓ, ℓ even: All LO nodes broadcast to corresponding HI neighbors new
identities learned at Step ℓ-2.

Termination is as in MH1.

Note that a node can be HI relative to one neighbor and LO relative
to another neighbor and it will broadcast to them on alternate steps.
In fact there must be at least one such node unless G is bipartite.

Algorithm MH1 takes D(G) steps whereas algorithm MH2 takes 2D(G)
steps. In both of them each node must perform a constant number of
operations for each identity it receives: determining if the identity
is old or new and storing it if it's new. Node i therefore performs
O(D(i)·n) computations.


At the end of either MH1 or MH2, each node i knows all neighbors
through which it has a minimum hop path to k, ∀k≠i. After MH1,
even more is known. Each node i also knows its position in MHDI(k)
relative to its neighbors. The rule for determining this is as follows:

We say that i is upstream (downstream) from a neighbor
j in a dag if the arc (i,j) ((j,i)) is in the dag. Then
there are three cases:

1) i has a minimum hop path to k through j. Then i is upstream from
j in MHDI(k).

2) i and j told each other about k at the same step. They are
not related, i.e. neither (i,j) nor (j,i) is in MHDI(k).

3) Otherwise i is downstream from j in MHDI(k).

The same works for MHDO(k) with the roles of upstream and downstream
reversed. This rule doesn't quite work with MH2 because of the
alternation. Consider the case in which i and j have equidistant minimum
hop paths to k. If i is HI and j is LO, then i tells j about k, but
j doesn't tell i about k. Algorithm MH2 uses less communication than
MH1 because it resolves such ambiguities in one direction only. That
is, j knows that i and j are not related in MHDI(k), but i only knows
that j is not downstream. There is a way to resolve these ambiguities
in Ln bits. Since j knows the precise relationships, it can communicate
this to i. At the end of MH2, j sends i either

1) a stream of n bits with a 1 in position k if i and j are not
related in MHDI(k), and a 0 in that position otherwise, or

2) a list of such nodes k.

If there are a such nodes k, j chooses 1) iff n < a log n. At the end of
MH2, i knows those k for which j is downstream from i in MHDI(k).
For the other nodes, it only knows that either i and j are not related
or that j is upstream in the appropriate MHDI. Using the information
described above, i can resolve those ambiguities. Notice that this
procedure must be done at most once for each <i,j> ∈ L. The total cost
of this is therefore Ln bits. So using MH1, it takes [2Ln-n(n-1)] log n
bits to find the MHDI(k), whereas using MH2 and the above information,
it takes only Ln[log n + 1] bits.
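
The choice between the two encodings can be sketched as follows. This is a minimal illustration, assuming the nodes are numbered 0..n-1 and that log n is rounded up to a whole number of bits; the function name `encode_unrelated` and the returned tuple format are hypothetical.

```python
def encode_unrelated(n, unrelated):
    """Pick the cheaper encoding of the nodes k for which i and j are
    not related in MHDI(k).

    Returns (kind, payload, cost_in_bits).
    """
    id_bits = max(1, (n - 1).bit_length())   # ceil(log2 n), at least 1
    a = len(unrelated)
    if n < a * id_bits:
        # Option 1: n bits, with a 1 in position k for each unrelated k.
        bitmap = [1 if k in unrelated else 0 for k in range(n)]
        return "bitmap", bitmap, n
    # Option 2: an explicit list of the a identities.
    return "list", sorted(unrelated), a * id_bits


dense = encode_unrelated(8, {1, 2, 5})
sparse = encode_unrelated(8, {3})
```

Either way the message from j to i costs at most n bits, which is where the Ln term in the total comes from.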

The previous discussion illustrates the significance of being able
to wait and encode identities efficiently. MH1 resolves odd cycle
ambiguities one at a time at a cost of log n bits/identity. With MH2,
the LO nodes can wait and encode all ambiguities in approximately the
minimum of (n, a log n) bits. If n < a log n, then the augmented MH2 is better.
Otherwise MH1 and MH2 have the same cost for finding the MHDI(k), since
these a identities would go from j to i in MH1 anyway. The number a
depends on the number of odd cycles in G. It appears difficult to make a
precise statement about how a depends on L and n. As an example, for
any n, even, and n-1 ≤ L ≤ n^2/4, ∃ a connected bipartite graph with n
vertices and L edges, and bipartite graphs have no odd cycles.

We now consider the problem of extracting minimum hop trees from
minimum hop dags. A node i knows if a neighbor j is upstream in
MHDI(k). However i doesn't know if j is upstream from any other nodes.

In the above example, there are two possible trees.
Node j is in a position to choose either i or ℓ as its downstream
neighbor in MHTI(k). So once the nodes have determined their relative
positions in MHDI(k), ∀k, they can extract minimum hop trees as
follows: Each node j must do:

For each k≠j, choose a downstream neighbor i in MHDI(k). There
is at least one such node. Tell this neighbor that you have chosen
it as your downstream neighbor in MHTI(k).
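
The extraction rule can be sketched in Python. In the text each node already knows its downstream neighbors from MH1; in this illustration a breadth first search stands in for that knowledge, the graph is assumed connected, and ties are broken by taking the smallest neighbor identity. The names `hop_distances` and `extract_mhti` are hypothetical.

```python
from collections import deque


def hop_distances(adj, k):
    """Hop distance from every node to k (stand-in for the output of MH1)."""
    dist = {k: 0}
    queue = deque([k])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def extract_mhti(adj, k):
    """For each j != k, choose one downstream neighbor in MHDI(k).

    The returned dict maps j to its chosen neighbor; each entry
    corresponds to one message carrying the identity k.
    """
    dist = hop_distances(adj, k)
    chosen = {}
    for j in adj:
        if j == k:
            continue
        downstream = [i for i in adj[j] if dist[i] == dist[j] - 1]
        chosen[j] = min(downstream)   # arbitrary but deterministic choice
    return chosen


square = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2}}
tree_to_0 = extract_mhti(square, 0)
total_messages = sum(len(extract_mhti(square, k)) for k in square)
```

In the four node cycle, node 3 has two downstream neighbors toward node 0 and must pick one of them.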

Note that even if j only has one downstream neighbor in
MHDI(k), it must tell this neighbor, because the neighbor might not know it is
a unique downstream neighbor. There are a total of n(n-1) such messages,
since each j must broadcast once about each k≠j. Thus this costs n(n-1)
log n bits, since the message to the chosen neighbor consists essentially of the
identity k. Once relative positions in MHTI(k) have been chosen, the
nodes immediately know relative positions in MHTO(k), ∀k. As an
example consider the following MHDO(k).


[Figure: example MHDO(k).]

If j chooses ℓ and p chooses 0, we get


At  the  beginning  of  this  section,  we  mentioned  the  problem  of 
starting  these  distributed  procedures.  The  algorithms  were  presented 
with  the  assumption  that  each  node  somehow  receives  a starting  signal. 
In  fact,  these  algorithms  can  operate  totally  asynchronously,  and  any 
node  can  choose  to  begin.  One  method  for  coordinating  the  nodes  is  the 
following: 

Suppose node i, and only node i, wishes to begin an
update. It does so by sending its
identity to its neighbors. Consider some neighbor
of i, say j. When j receives node i's identity, j
"wakes up" and sends its own identity to all its
neighbors, including i. Consider a neighbor of j,
say k. k is either asleep (and so wakes up) or
has been awakened by another node, when it receives
j's probe. Thus k either sends or has already sent
its identity to j. A node pair x,y s.t. <x,y> ∈ L is
thus initiated into the update when they exchange
identities. Once node i (j) has learned the identities
of all its neighbors, it sends this information
to j (i). After nodes i and j have done this with
all neighbors, each knows two hop minimum hop paths,
and then they exchange this information. In general,
an arbitrary node p will at some point have all
information about m hop minimum hop paths. Node p
sends this information to all its neighbors and
waits to hear from them. When it has received the
m hop information from all its neighbors, it can
compute its own m+1 hop minimum hop paths, and the
process is repeated.

Observe that this method also works even if any number of nodes
independently decide to begin an update. In general node p will be awakened
by probes from one or more of its neighbors. Node p then sends its
identity to all its neighbors and the process continues as above. The
point is that once a node pair x,y s.t. <x,y> ∈ L exchange identities,
they are initiated into the procedure and are coordinated from then on.
The alternation feature of MH2 can also be implemented in this manner.
When a node pair exchange identities, they can then decide which is
HI and which is LO and proceed as usual.


It is desirable for a distributed algorithm to operate correctly
in an asynchronous environment. If one wishes to design routing
algorithms that work in the presence of link and node failures, it
is important for the algorithms to allow a node to recognize some
local failure and then independently initiate some procedure to inform
the other nodes.

III.3 Broadcasting All Arclengths

It  was  mentioned  in  the  introduction  that  classical  shortest 
path  algorithms  can  be  used  by  one  computer  if  it  has  all  arclengths 
in  memory.  In  this  section,  we  present  algorithms  for  broadcasting 
all  arclengths  to  all  nodes  and  for  transmitting  all  arclengths  to 
one  node  and  then  having  this  node  send  shortest  path  information 
back  to  the  other  nodes.  It  is  assumed  that  each  node  i knows 

1) its identity

2) its upstream and downstream neighbors in the minimum hop trees,
MHTO(k) and MHTI(k), ∀k.

3) ℓ(i,j), ∀ j ∈ AL(i).

The  second  assumption  is  not  entirely  impractical  since  the  nodes 
need  learn  this  information  only  once,  and  these  trees  can  be  used 
repeatedly  for  communication  purposes  as  the  arclengths  change.  One 
can  view  this  as  the  distributed  equivalent  of  preprocessing. 

Now consider the following procedure for broadcasting all
arclengths to a single destination j.


Algorithm BASD

Node j decides to do an update and begins by broadcasting a
"start" message to its neighbors in MHTO(j). These nodes in turn
broadcast the "start" message to their neighbors in MHTO(j). The
message keeps propagating down the tree until it reaches the leaves.
When a leaf receives the start, it broadcasts its arclength list to
its downstream neighbor in MHTI(j). This procedure continues until
all arclengths reach the root j. That is, when a node i has received
information from all of its upstream neighbors in MHTI(j), it
broadcasts this information together with its own arclength list
to its downstream neighbor in MHTI(j).

Example: 


Node 1 is the destination. MHTI(1) is shown but other arcs in the
network are not shown.

7 sends ALLST(7) to 4.

4 sends ALLST(7) and ALLST(4) to 2.

5 sends ALLST(5) to 2.

2 sends ALLST(7), ALLST(4), and ALLST(5) to 1.

Other  arclength  lists  propagate  similarly. 

Notice  that  the  algorithm  operates  completely  asynchronously.  The 
start  message  can  be  interpreted  as  a ready  command, and  receipt  of 
arclength  lists  from  an  upstream  neighbor  is  an  acknowledgement  from 
that  neighbor. 

An arclength ℓ(i,k) will traverse each arc in the tree path
from i to the root j exactly once. The maximum number of arcs in
such a path is just the depth of MHTI(j). An arclength can be
specified as a triple (i, j, ℓ(i,j)) in 2 log n + log(ℓ-max) bits.
Since max_j Depth(MHTI(j)) = D(G), the communication cost of this
procedure is upperbounded over all possible destination nodes by

2L D(G) [2 log n + log(ℓ-max)] bits.
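
The count behind this bound can be sketched as follows: every arclength held by node i crosses depth(i) tree arcs on its way to the root. The sketch is illustrative only. The function name `basd_arclength_hops` is hypothetical, the tree is the fragment of MHTI(1) named in the example above, and the list lengths are made-up values since the text does not give them.

```python
def basd_arclength_hops(parent, list_len, root):
    """Number of (arclength, arc) traversals when every arclength list is
    relayed along tree arcs to the root.

    parent[i] is the downstream neighbor of i in the tree toward `root`;
    list_len[i] is the number of arclengths in ALLST(i).
    """
    def depth(i):
        hops = 0
        while i != root:
            i = parent[i]
            hops += 1
        return hops

    return sum(list_len[i] * depth(i) for i in parent)


# Fragment of the example tree: 7 -> 4 -> 2 -> 1 and 5 -> 2 -> 1.
parent = {7: 4, 4: 2, 5: 2, 2: 1}
# Hypothetical list lengths (one arclength per incident tree link).
list_len = {7: 1, 4: 2, 5: 1, 2: 3}
hops = basd_arclength_hops(parent, list_len, root=1)
```

Multiplying the result by the size of one arclength triple gives the cost in bits for this destination; the bound in the text replaces each depth by D(G) and the total list length by 2L.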

Once the destination node j has computed shortest paths for all
node pairs, it can send this information back to the other nodes via
minimum hop paths. Each node i≠j requires a neighbor m_k through
which it has a shortest path to k, ∀k≠i. The identity of each such
best route neighbor can be sent from j to i via a minimum hop path
having at most D(G) arcs. Accounting for all i, we see that node j
must send the identities of (n-1)^2 such best route neighbors. Since
each identity traverses at most D(G) arcs and can be encoded with
log n bits, the communication cost of this procedure is

O(D(G) (n-1)^2 log n) bits.

Hence the overall cost of sending all arclengths to one node and then
having this node send shortest path information back to the other
nodes is

O[L D(G) (log n + log(ℓ-max)) + D(G) n^2 log n] bits.

We now consider the problem of broadcasting all arclengths to all
destinations.

Algorithm BAAD

1. Each node i begins by sending its identity and arclength list
to its neighbors in MHTO(i), i.e. in its own tree.

2. When a node receives such an arclength list, it examines the
source and then sends this list on to its downstream neighbors
in the MHTO for that source.

Each node k receives ℓ(r,s) for r≠k exactly once. Thus

CC(BAAD) = 2L(n-1) (arclength messages),

where an arclength message uses O(log n + log(ℓ-max)) bits.
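
A small simulation can confirm this count. The sketch below is an illustration under assumptions: a breadth first search tree rooted at r stands in for MHTO(r), every duplex link is assumed to contribute one arclength to the list of each endpoint, and the function name `baad_arclength_messages` is hypothetical.

```python
from collections import deque


def baad_arclength_messages(adj):
    """Count arclength messages when every node's arclength list is sent
    down its own minimum hop tree to all other nodes.

    adj maps each node to the set of its neighbors (duplex links).
    """
    total = 0
    for r in adj:
        seen = {r}
        queue = deque([r])
        while queue:
            u = queue.popleft()
            for v in sorted(adj[u]):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
                    # ALLST(r), with D(r) arclengths, crosses tree arc (u, v).
                    total += len(adj[r])
    return total


triangle_total = baad_arclength_messages({0: {1, 2}, 1: {0, 2}, 2: {0, 1}})
square_total = baad_arclength_messages(
    {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2}})
```

Each tree has n-1 arcs and the list lengths sum to 2L, which gives 2L(n-1) regardless of which minimum hop trees are chosen.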

Again, observe that the algorithm is completely asynchronous,
i.e., any node can start the update, and transmission delays can
be arbitrary. If some node x is "sleeping" (i.e. not involved in
the update) when it receives a transmission from one or more
neighbors, it "wakes up", sends its own arclength list down its own
tree, and then sends other subsequent arclength lists down the
appropriate trees.

Observe that a variation of this algorithm can be used to send
the topology of the whole network to all nodes, in O(Ln log n) bits.
The nodes first perform a minimum hop algorithm to establish minimum
hop trees. Then algorithm BAAD is executed with nodes simply sending
adjacency lists rather than arclength lists.

III.4 Worst Case Analysis of Some Distributed Shortest
Path Algorithms

In  this  section,  we  assume  that  each  node  i initially  knows 

1)  its  own  identity  and  the  identities  of  its  neighbors 

2)  the  number  of  nodes  in  the  network,  n 

3) ℓ(i,j), ∀ j ∈ AL(i).

Once  again,  we  temporarily  ignore  the  problem  of  beginning  the 
algorithm,  and  methods  for  coordinating  asynchronous  operation  will 
be  discussed  later. 

Algorithm SP1: (Distributed Bellman-Ford)

(The reader may wish to refer to Section II.1 for the discussion of
the centralized Bellman-Ford Algorithm.)

Arclengths are not assumed to be positive. Each node i executes:

1. m ← 0

2. ∀ j≠i do: d(i,j,m) ← ℓ(i,j).

3. If d(i,j,m) < ∞, broadcast d(i,j,m) to all neighbors.

4. Receive transmissions from neighbors.

5. Update distances as follows:

$$d(i,j,m+1) \leftarrow \min\Big\{ d(i,j,m),\ \min_{k \in AL(i)} \big( d(k,j,m) + \ell(i,k) \big) \Big\}$$

6. m ← m+1

7. If m > n-1 then stop. Else go to step 3.


Although the details have been omitted, it is clear that each node
can easily keep track of which neighbor is associated with a
shortest distance, i.e. a neighbor k s.t. d(i,j,optimal) = ℓ(i,k) +
d(k,j,optimal).
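
The steps of SP1 can be sketched as a synchronous simulation. This is an illustration under assumptions rather than the author's implementation: all nodes iterate in lock step, every link is duplex (so k can broadcast to i whenever k ∈ AL(i)), there are no negative cycles, and the names `sp1`, `length`, and `best_neighbor` are hypothetical.

```python
import math


def sp1(length):
    """Distributed Bellman-Ford, simulated in synchronous iterations.

    length[i][k] is the arclength l(i,k) for each k in AL(i).
    Returns (d, best_neighbor, messages).
    """
    nodes = list(length)
    n = len(nodes)
    # Steps 1-2: d(i,j,0) = l(i,j), or infinity if there is no arc (i,j).
    d = {i: {j: length[i].get(j, math.inf) for j in nodes if j != i}
         for i in nodes}
    best_neighbor = {i: {j: (j if j in length[i] else None)
                         for j in nodes if j != i} for i in nodes}
    messages = 0
    for m in range(n):                     # step 7 stops once m > n-1
        # Values d(.,.,m) that are broadcast during this iteration.
        snapshot = {i: dict(d[i]) for i in nodes}
        for i in nodes:
            for k in length[i]:            # k in AL(i) broadcasts to i
                for j, dkj in snapshot[k].items():
                    if dkj == math.inf:
                        continue           # step 3: only finite distances
                    messages += 1
                    if j == i:
                        continue
                    if dkj + length[i][k] < d[i][j]:   # step 5
                        d[i][j] = dkj + length[i][k]
                        best_neighbor[i][j] = k
    return d, best_neighbor, messages


arcs = {1: {2: 1, 3: 5}, 2: {1: 1, 3: 1}, 3: {1: 5, 2: 1}}
d, best_neighbor, messages = sp1(arcs)
```

In this three node example the direct arc from 1 to 3 has length 5, and node 1 discovers the length 2 route through node 2 in the first iteration.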

Each node i broadcasts at most n times about each of the n-1 other
nodes in the worst case. Thus node i broadcasts at most D(i)n^2
distances, where D(i) is the degree of i. Summing over all i, we obtain

CC(SP1) ≤ 2Ln^2 (distance messages).

Observe that the following improvements can be made in practice:

a) A node does not initially have to know the number of nodes
in the network, but can learn this as the algorithm is
executed. This is because if after the mth iteration, a
node i has heard about x other nodes, with x < n-1, then
node i must hear at least one new identity at the (m+1)st
iteration. Thus, if node i learns no new identities at
the (m+1)st iteration, it knows the identity of (but
not necessarily a shortest path to) every other node in
the network.

b) Node i need only broadcast d(i,j,m+1) to a neighbor k
if d(i,j,m+1) < d(i,j,m) and d(i,j,m+1) < min over ℓ ≤ m of
{d(k,j,ℓ) received from k}.

c) If d(i,j,m+1) = d(i,j,m), ∀ j≠i, then node i can stop.

d) If ℓ(i,j) ≥ 0 ∀ (i,j) ∈ A, then at each iteration node i can
mark the smallest unmarked distance as being a shortest
distance. Initially, node i marks any distance ℓ(i,ĵ) s.t.
ℓ(i,ĵ) = min{ℓ(i,j) | j ∈ AL(i)} as being a shortest distance.
The proof of correctness of this is completely analogous
to the proof of correctness of Dijkstra's algorithm
(see II.1). Thus at iteration m, node i will have at most
n-m+1 possible improved distances to broadcast. This reduces
the communication cost by approximately a factor of
2 since Σ_{j=1}^{n} (n-j) ≈ n^2/2 rather than n^2.

Notice that if ℓ(i,j) = a positive constant ∀ (i,j) ∈ A, then algorithm
SP1 with improvements essentially reduces to the minimum hop algorithm
MH1 (see III.2) and only O(Ln) distance messages are transmitted.
Unfortunately, the O(Ln^2) upperbound is tight even if
ℓ(i,j) > 0 ∀ (i,j) ∈ A. In particular, the algorithm is neither O(L^2) nor
O(n^3), either of which is better than O(Ln^2). Examples to demonstrate
the tightness of the O(Ln^2) upperbound are presented in
Appendix A. One could conceivably use other modifications or
preprocessing (for example, assume the nodes know the topology of the
network) to further reduce the constant factor so that CC(SP1) ≤ cLn^2,
with c<2, or to efficiently encode distances. However, such discussion
has been omitted since there are asymptotically more efficient
algorithms for graphs with positive arclengths. In III.5, we will
discuss methods for making these algorithms as efficient as possible
in terms of a bit communication cost rather than just a "message"
cost.

We now turn to a discussion of two Dijkstra-like procedures that
can be used if all arclengths are positive.

Algorithm SP2:

Recall that N is the set of nodes in the network and that
ALLST(k) is the arclength list of node k.

Each node i executes:

1. T ← N-{i}

2. ∀ j ∈ AL(i) do: [d(i,j), NT(j)] ← [ℓ(i,j), j]

3. ∀ j ∉ AL(i) do: [d(i,j), NT(j)] ← [∞, blank]

Comment: NT(j) is the identity of a neighbor through which i has
a path to j of distance d(i,j). Initially, NT(j)=j
if j ∈ AL(i) and NT(j) is undefined (blank) if j ∉ AL(i).

4. Let k̂ be such that d(i,k̂) = min{d(i,k) | k ∈ T}.

5. T ← T - {k̂}

6. If ALLST(k̂) is not in memory, send a request to k̂ for its
arclength list. The request traverses the minimum hop path
from i to k̂ and is answered by the first node along the path
having the information.

7. ∀ s ∈ AL(k̂) do:

If ℓ(k̂,s) + d(i,k̂) < d(i,s) then
[d(i,s), NT(s)] ← [d(i,k̂) + ℓ(k̂,s), NT(k̂)].

8. If T = ∅ then stop. Else go to 4.

In parallel node i executes the following communication process:

1. When a probe of the form {Request ALLST(k̂)} arrives from a
neighbor j do:

a) If ALLST(k̂) is in memory, send it to j.

b) If ALLST(k̂) is not in memory do:
Record the fact that j wants ALLST(k̂). If node i
has not requested ALLST(k̂), send a request to the
downstream neighbor in MHTI(k̂).

2. When ALLST(k̂) is received, send it to all neighbors that have
requested it.

Node i receives each arclength ℓ(r,s) for r≠i at most once.
Further, each node i requests every other arclength list at most
once. Thus

CC(SP2) ≤ 2L(n-1) arclength messages + n(n-1) request messages.
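
The computation performed by a single node in SP2 can be sketched as follows. This is a minimal illustration, not the author's implementation: the callback `fetch_allst` is a hypothetical stand-in for the whole request and answer mechanism over the minimum hop trees, all arclengths are assumed positive, and the name `sp2_at_node` is invented for the example.

```python
import math


def sp2_at_node(i, nodes, fetch_allst):
    """Dijkstra's algorithm as run by node i, fetching ALLST(k) on demand.

    fetch_allst(k) returns a dict s -> l(k,s).
    Returns (d, NT, requests).
    """
    memory = {i: fetch_allst(i)}          # node i knows its own arclengths
    requests = 0
    T = set(nodes) - {i}                  # step 1
    d = {j: math.inf for j in T}          # step 3
    NT = {j: None for j in T}
    for j, lij in memory[i].items():      # step 2
        d[j], NT[j] = lij, j
    while T:                              # step 8
        k = min(T, key=lambda x: d[x])    # step 4
        T.remove(k)                       # step 5
        if d[k] == math.inf:
            break                         # remaining nodes are unreachable
        if k not in memory:               # step 6
            memory[k] = fetch_allst(k)
            requests += 1
        for s, lks in memory[k].items():  # step 7
            if s != i and d[k] + lks < d[s]:
                d[s], NT[s] = d[k] + lks, NT[k]
    return d, NT, requests


arcs = {1: {2: 1, 3: 5}, 2: {1: 1, 3: 1}, 3: {1: 5, 2: 1}}
d, NT, requests = sp2_at_node(1, arcs, arcs.__getitem__)
```

Node 1 issues one request per other node, which matches the n(n-1) request messages counted over all nodes.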

We  also  observe  that  the  algorithm  can  be  executed  completely 
asynchronously,  and  the  coordination  required  is  analogous  to  that 
of  the  previous  asynchronous  algorithms. 

From a computational point of view, algorithm SP2 is exactly
Dijkstra's algorithm. Since nodes only begin with local information
though, they must request ALLST(k̂) after k̂ has been labelled in order
to find the next shortest path. Communication takes place over the
minimum hop trees in order to ensure that each arclength list is
received only once by each node. Once a node j has
received ALLST(k), a subsequent request for ALLST(k) made by a node i
that is upstream from node j in MHTI(k) can be answered by node j.
Thus from a communications point of view, algorithm SP2 is essentially
equivalent to the procedure for broadcasting all arclengths to all
nodes (Algorithm BAAD - see III.3) in that ALLST(k) travels to all
other nodes via paths in MHTO(k). The notion of a request mechanism
was introduced to interleave the communication and computation.
However, the communication and computation structures are not
coordinated since the minimum hop trees and shortest path trees need
not be the same. Thus, algorithm SP2 does not fully exploit the
distributed character of the shortest path problem that arises from
the optimality principle. This principle implies that if the path
[i_2, ..., i_m] has been examined by node i_2 and found to be non-optimal,
then the path [i_1, i_2, ..., i_m] need not be examined by i_1 since it
cannot be optimal. Thus if the path [i_1, i_2, ..., i_{m-1}] is a shortest
path and node i_1 asks node i_2 for ALLST(i_{m-1}), node i_2 should not
give the arclength ℓ(i_{m-1}, i_m) to node i_1 since the path
[i_1, i_2, ..., i_m] is not optimal. Essentially, node i_2 can "prune" a
part of node i_1's potential shortest path tree. We now present the
algorithm formally.

Algorithm SP3: [10]

Each node i executes:

1. T ← N-{i}

2. P ← {i}

3. ∀ j ∈ AL(i) do: [d(i,j), NT(j)] ← [ℓ(i,j), j]

4. ∀ j ∉ AL(i) do: [d(i,j), NT(j)] ← [∞, blank]

5. CANDIDATES(i,i) ← ALLST(i)

6. ∀ s≠i do: CANDIDATES(i,s) ← ∅

7. ∀ j ∈ N-{i} do: FATHER(j) ← i
Comment:

CANDIDATES(i,s) is the set of arclengths ℓ(s,k) that are useful
to node i, that is, those arclengths that correspond to arcs which
are in shortest paths or potentially in shortest paths from node i
to other nodes. Since node i initially knows only its own arclengths,
CANDIDATES(i,i) ← ALLST(i) and CANDIDATES(i,s) ← ∅ for s≠i.
FATHER(i,s) is the immediate predecessor of node s in the path from
i to s having a length equal to the present value of d(i,s).
FATHER(i,s) must be remembered in order to do pruning.


8. m ← 1

9. Let k̂_1, ..., k̂_b(m) be such that d(i,k̂_j) = min{d(i,k) | k ∈ T}
for 1 ≤ j ≤ b(m).

Comment:

The nodes k̂_1, ..., k̂_b(m) are the unlabelled nodes that are
closest to i. The number b(m) will possibly vary on each iteration.
We have previously presented Dijkstra's algorithm with only one node
at a time being processed, that is, a node i would select any
node that minimized tentative distances. However, it is certainly
correct and potentially faster to label all nodes k̂_1, ..., k̂_b(m) at
once.

10. For j=1 to b(m) do:

T ← T - {k̂_j}
P ← P ∪ {k̂_j}
If ALLST(k̂_j) is not in memory, request it from NT(k̂_j)
end;

11. For j=1 to b(m) do:

For each arclength ℓ(k̂_j,s) received do:

If d(i,k̂_j) + ℓ(k̂_j,s) < d(i,s) then do:

CANDIDATES(i,k̂_j) ← CANDIDATES(i,k̂_j) ∪ {ℓ(k̂_j,s)}

If d(i,s) < ∞ then
CANDIDATES(i,FATHER(s)) ← CANDIDATES(i,FATHER(s)) -
{ℓ(FATHER(s),s)}

[d(i,s), NT(s)] ← [d(i,k̂_j) + ℓ(k̂_j,s), NT(k̂_j)]

FATHER(s) ← k̂_j

Comment:

Pruning is done in step 11. If d(i,s) < ∞ then d(i,s) is the
length of an actual path in the graph going from i to s through
FATHER(s). If d(i,k̂_j) + ℓ(k̂_j,s) < d(i,s) then the path from i to s
through k̂_j is better and so we delete ℓ(FATHER(s),s) from
CANDIDATES(i,FATHER(s)).


12. For j=1 to b(m) do:

If there are any requests for ALLST(k̂_j), send
CANDIDATES(i,k̂_j)
end;

13. If T = ∅ stop. Else do: m ← m+1

Go to 9.

In parallel, node i executes the following communication process.

1. If a neighbor j requests ALLST(r) then do:

If r ∈ P, send j CANDIDATES(i,r).

Else record the fact that j has requested ALLST(r).
Observe that as is the case with algorithm SP2, each node i
can learn the number of nodes in the network as it executes the
procedure. We further observe that algorithm SP3 can be executed
completely asynchronously. Any node may choose to begin an update
and "awaken" its neighbors. These nodes in turn begin their
updates and wake up their neighbors etc. However, there is a potential
for deadlock that does not arise with SP2. There, a node i immediately
forwards any request for ALLST(k̂) on to its downstream neighbor in
MHTI(k̂), if node i does not have the information. The request can
keep propagating down the tree, but can eventually be answered since
node k̂ knows ALLST(k̂) by assumption. With SP3 though, a node j
passes on ALLST(k) to some neighbor i that requested it, only after
j has processed it (by process, we mean execute step 11 of SP3 on the
arclengths received). In fact, the whole point of pruning is that if i
has a shortest path to k through j, j should process ALLST(k) before
i does. If transmission delays are arbitrary, j may not have
processed ALLST(k) by the time that i wants it. This point is
illustrated by the following example:


Node 1 initially labels nodes 2, 4, and 5, and node 2 initially labels
5. After node 1 requests, receives, and processes the arclengths
from 2, 4, and 5 (by process we mean execute step 11 of algorithm
SP3 on the arclengths received from 2, 4, and 5), it realizes that
the path P[1,3] = [1,2,3] is the next best shortest path, and it asks
node 2 for ALLST(3). If node 5 is slow in responding to 2's request
for ALLST(5), node 2 will not have processed ALLST(3) by the time
node 1 requests it, i.e. 2 is waiting on 5 before it proceeds.

This leads one to suspect that there is the possibility of having a
cycle [i_1,...,i_k = i_1] such that node i_l has requested some
information from node i_(l+1), but i_(l+1) does not have it,
1 ≤ l ≤ k-1. In this case, there will be a deadlock. Fortunately,
this cannot occur if all arclengths are positive. We show this for a
3 node cycle.


In the above figure, k(1,2) is the node whose arclength list has been
requested by 1 from 2, that is d(1,k(1,2)) = ℓ(1,2) + d(2,k(1,2)).
k(2,3) and k(3,1) are analogously defined. Since algorithm SP3
processes nodes in order of increasing distance from the source, the
fact that 2 has not processed (i.e. executed step 11) the arclengths
of k(1,2) means that d(2,k(2,3)) ≤ d(2,k(1,2)). If the deadlock exists
this inequality must hold for the other node pairs as well, and
we obtain

    d(1,k(3,1)) ≥ d(1,k(1,2))
    d(2,k(1,2)) ≥ d(2,k(2,3))          III.4-1
    d(3,k(2,3)) ≥ d(3,k(3,1))

But if ℓ(i,j) > 0 ∀(i,j) ∈ A we also have

    d(1,k(1,2)) > d(2,k(1,2))
    d(2,k(2,3)) > d(3,k(2,3))          III.4-2
    d(3,k(3,1)) > d(1,k(3,1))

Combining III.4-1 and III.4-2 gives the chain

    d(1,k(3,1)) ≥ d(1,k(1,2)) > d(2,k(1,2)) ≥ d(2,k(2,3))
                > d(3,k(2,3)) ≥ d(3,k(3,1)) > d(1,k(3,1)),

so we see d(1,k(3,1)) > d(1,k(3,1)) which
is a contradiction. We also see that in fact we only need one of
the inequalities in III.4-2 to be strict, i.e. only one of
ℓ(1,2), ℓ(2,3), ℓ(3,1) need be strictly positive. This argument
clearly extends to an arbitrary cycle and so we conclude that if
ℓ(i,j) ≥ 0 and there are no zero length cycles, then in any cycle of
requests, there must be at least one node that can answer the request
made of it without waiting for its own request to be answered.

Communication  Cost: 

Each node i requests ALLST(j) for j≠i and receives each arc in
ALLST(j), j≠i, at most once. Thus CC(SP3) ≤ 2L(n-1) arclengths
+ n(n-1) request messages. Unfortunately, we have been unable to
quantify the effectiveness of pruning. This is because pruning
depends very heavily on the topology of the graph, and for given
values of n and |L|, there are many graphs on n nodes and |L| links
that have distinctly different topologies. One can construct
graphs in which one node, even with pruning, will receive a significant
number of the arclengths. This does not mean all nodes will receive
a lot of arclengths. It is just difficult to quantify the
interactive effects of pruning. Thus, the above bound on CC(SP3)
basically states that the total worst case communication cost is just
n times the worst case amount of information that one node must
receive. One useful "rule of thumb" is that pruning tends to be
relatively more effective in those graphs that have relatively long
(in terms of number of arcs) shortest paths, because there are
relatively more intermediate nodes in such paths to do the pruning.
Algorithms SP2 and SP3 also have an advantage over algorithm SP1 in
that only arclengths and not path distances must be transmitted.
This reduces the bit cost of a message by approximately log n bits
since a path length can be (n-1)(ℓ-max) which requires approximately
log n + log(ℓ-max) bits to encode.


III. 5 Preprocessing  and  Message  Encoding 


In III.3, we assumed that each node knows its position in the
various minimum hop trees for purposes of broadcasting all arclengths,
and we viewed this as a distributed version of preprocessing. In
this section, we investigate some improvements to algorithm SP3 that
can be made by assuming that all nodes know the topology of the
network. More specifically, we assume that each adjacency list is
assigned some order and that all nodes know all ordered adjacency
lists.

Firstly, we observe that for each <x,y> in L, the source node i
needs to know at most one of ℓ(x,y) and ℓ(y,x). If x is labelled
before y is, then the arc (x,y) is potentially in a shortest path
from i through x to y, and so node i can use ℓ(x,y). When y is
labelled, however, the arc (y,x) is of no use since node i already
has a shortest path to x that doesn't visit y. If x and y are
labelled at the same iteration, then neither the arclength ℓ(x,y)
nor the arclength ℓ(y,x) is of any use to node i. The problem is
that if the shortest path from i to x goes through a neighbor of
i, say j_x, and the shortest path from i to y goes through j_y, with
j_x ≠ j_y, then j_y may not necessarily know that i has already
labelled x and hence j_y will give i the arclength ℓ(y,x). Thus we
need an efficient way for i to tell its neighbors which arclengths
are potentially useful. To do this, we introduce the notion of
candidate bit vectors. Suppose that the ordered adjacency list
of a node p is {p_1,...,p_D(p)} where D(p) is the degree of p. Then,
when i labels p and requests the arclengths of node p from a neighbor
of i, say j, node i gives j a vector of D(p) bits with a 1 in the
k-th position iff node p_k is unlabelled by i, i.e. iff i does not yet
know a shortest path to p_k. The information in the bit vector allows
j to know which arcs are still candidates as far as i is concerned.

We further note that node i may even be able to use fewer than D(p) bits.
Suppose that

1) p_k ∈ AL(p) and p_k ∈ AL(s).

2) The shortest paths from i to both s and p go through a
   neighbor j.

3) Node i labelled nodes p_k, s, p in that order and s was the
   first node that i labelled after p_k, such that the shortest
   path from i to s went through j.

Then, when i requested the arclengths of s from j, i gave a bit to
j that indicated p_k was already labelled. Hence when i labels p
and asks j for the arclengths of p, i does not even have to give a
bit for the arc (p,p_k) since j already knows it cannot be useful to
i. To accomplish this, nodes i and j must both maintain a list, say
P_ij, of those nodes that j knows i has labelled. Node i of course
maintains its own list, say P_i, of all nodes it has labelled. When i
labels p and asks j for the arclengths of p, it constructs the
following bit vector:
Suppose AL(p) = {p_1,...,p_D(p)}, and p_k ∈ AL(p). There are 3 cases.

1) p_k ∈ P_ij. Node i does not need to give a bit since j knows i
   has labelled p_k.

2) p_k ∈ P_i - P_ij. The corresponding bit is 0 since i has labelled
   p_k and so doesn't need the arclength ℓ(p,p_k).

3) p_k ∈ N - P_i where N = set of nodes in the network. In this case,
   the arclength ℓ(p,p_k) is possibly useful and the bit is 1.


Now j receives a vector of bits b_1,...,b_s (s ≤ |AL(p)|) where a
bit b_r corresponds to a node p_(k_r) such that p_(k_r) is the r-th
node in AL(p) that is not in P_ij. If b_r = 0, j knows i has labelled
p_(k_r) since i last requested some arclengths from j. In this case j
adds p_(k_r) to P_ij. If b_r = 1, j knows that ℓ(p,p_(k_r)) is
potentially useful to i and so gives the arclength to i unless it
(i.e., j) already pruned it. (Recall that by the optimality principle,
whatever is not useful to j cannot be useful to i.) The exact number
of bits, which is |AL(p)| - |AL(p) ∩ P_ij|, depends very much on the
topology, and so we are unable to make any general statement. We can
only say that a source node i will give at most

    Σ_{s=1, s≠i}^{n} |AL(s)|

bits in total, i.e. one bit vector of at most |AL(s)| bits when it
requests the arclengths of s, for each s≠i. Hence all nodes use at
most 2L(n-1) bits for all candidate bit vectors. With this scheme,
each node i will receive at most L - |AL(i)| arclengths and so
overall, at most L(n-2) arclengths are transmitted. As before, any
other pruning that occurs will reduce this cost even further. If an
arclength message uses b bits, then this scheme uses the fraction

    [2L(n-1) + L(n-2)b] / [2L(n-1)b]  ~  (b+2)/(2b)    as L,n → ∞

of the number of bits used by algorithm SP3.
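
The three cases for building a candidate bit vector, and the way node
j interprets it, can be sketched as follows. The function names and
the use of Python sets for P_i and P_ij are assumptions made for this
illustration only; the additional filtering by arcs that j has itself
pruned is not modelled.

```python
def build_bit_vector(al_p, labelled_by_i, known_to_j):
    """Node i: one bit per node of AL(p) that is not already in P_ij.

    The bit is 0 if i has labelled the node (case 2) and 1 if the
    arclength is still potentially useful (case 3). Nodes in P_ij get
    no bit at all (case 1).
    """
    return [0 if q in labelled_by_i else 1
            for q in al_p if q not in known_to_j]


def decode_bit_vector(al_p, bits, known_to_j):
    """Node j: update P_ij in place, return the nodes still useful to i."""
    pending = [q for q in al_p if q not in known_to_j]
    if len(bits) != len(pending):
        raise ValueError("bit vector does not match AL(p) - P_ij")
    useful = []
    for q, bit in zip(pending, bits):
        if bit == 0:
            known_to_j.add(q)      # j learns that i has labelled q
        else:
            useful.append(q)       # l(p, q) may still help i
    return useful


al_p = ["a", "b", "c", "d"]        # ordered adjacency list of p
p_i = {"a", "b"}                   # nodes i has labelled
p_ij = {"a"}                       # nodes j knows i has labelled
bits = build_bit_vector(al_p, p_i, p_ij)
useful = decode_bit_vector(al_p, bits, p_ij)
```

Here the vector has |AL(p)| - |AL(p) ∩ P_ij| = 3 bits, and after
decoding j has added node b to its copy of P_ij, so a later request
that involves b needs no bit for it.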

Another improvement that can be made with preprocessing is in
the encoding of arclength messages. We have previously said that
an arclength can be specified as a triple (x,y,ℓ(x,y)) in
2 log n + log(ℓ-max) bits. Now if node i requests the arclengths of x
from j, node j clearly does not have to specify x in the triple since
i knows x. We further observe that if each node knows the topology,
then node j need only specify the identity of y relative to other
nodes in AL(x). This can be done in log(D(x)) bits, where
D(x) = degree of x. For sparse graphs, i.e. D(x) << n, this results
in an improvement of about log(n/D(x)) bits and may be significant.
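
As a rough numerical illustration of the two encodings, the following
sketch counts bits per arclength message. It assumes integer
arclengths in the range 0..l_max and rounds every field up to a whole
number of bits; the function names are hypothetical.

```python
def field_bits(alternatives):
    """Whole bits needed to distinguish the given number of alternatives."""
    return (alternatives - 1).bit_length() if alternatives > 1 else 0


def triple_bits(n, l_max):
    # Full triple (x, y, l(x,y)): two node names plus one length.
    return 2 * field_bits(n) + field_bits(l_max + 1)


def relative_bits(degree_x, l_max):
    # x is implied by the request; y is named by its position in AL(x).
    return field_bits(degree_x) + field_bits(l_max + 1)


full = triple_bits(1024, 255)
short = relative_bits(4, 255)
```

For n = 1024 nodes, lengths up to 255 and a node x of degree 4, the
full triple takes 28 bits while the relative encoding takes 10.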
IV. AVERAGE COMMUNICATION COST ANALYSIS


IV. 1 Motivation 

Thus far, we have developed an [Ln arclength messages + 2Ln bits]
distributed shortest path algorithm and a 2Ln arclength messages
algorithm for broadcasting all arclengths to all nodes (Algorithm
BAAD, Section III.3). Observe however, that the latter procedure
has a constant cost over all networks with n nodes, L links, and any
arclength assignment, whereas the pruning of algorithm SP3 may
reduce the cost of that algorithm to be asymptotically smaller than
O(Ln), at least for certain topologies. Still, it is somewhat
disappointing that we have not found a shortest path algorithm whose
worst case communication cost is provably less than that of algorithm
BAAD.

More specifically, is there a distributed shortest path algorithm
whose communication cost is upperbounded by αn² arclength messages
for some constant α? We note that for classes of sparse graphs (say
the degree of each node ≤ a constant β that is independent of n), the
O(Ln) bound is O(n²). However the constant multiplying the n² term
increases with β. The following "plausibility argument" is meant to
indicate to the reader some of the reasons that will make it difficult
to ever find an O(n²) algorithm for sparse graphs, if we have each
node execute the same single source algorithm. (We term such
algorithms homogeneous because all nodes execute the same procedure.)
Consider the following ring network.


It is plausible that for this graph, node 1 must examine at least k-1
arclengths in the worst case, even if node 1 knows the topology.
Hence, it is plausible that for the following graph, node 1 must
examine O(L) arclengths in the worst case.
Even with pruning and knowledge of the topology, node 1 must in
the worst case resolve all potential ties created by elementary
cycles. Since our techniques do not really enable us to distinguish
one source from another, we can only say that the worst case total
is just n times the worst case for one node. This yields the O(Ln)
bound.

As we see, the crux of the problem lies in precisely determining
the interactions of the various single source problems, solely in
terms of n and L. This appears to be difficult in view of the large
number of different topologies that can exist. The properties of
homogeneity and topology independence are useful in practice since
they permit the same program to be used at all nodes, regardless of
topology. While one could conceivably design a more efficient
algorithm for a particular network, this approach seems to be somewhat
impractical since this algorithm may be very inefficient on some other
network. Because the addition or deletion of a few choice nodes can
significantly alter some topologies, one may have to develop a new
reoptimized algorithm for each new graph. In our algorithms, the nodes
may indeed know the topology, but nothing special is assumed about
it, i.e. the topology is part of the input.

With this in mind, we consider the communication cost of a
distributed shortest path algorithm when averaged over random
arclengths. This analysis is motivated by the fact that the lengths
assigned to the links of data networks for purposes of routing may
be appropriately modelled as random variables. Since shortest path
updates may be performed relatively frequently, our average case
analysis will perhaps be relevant to the average communication cost
of performing these updates.

IV. 2 Average Computation Analysis of Spira's Algorithm

We  briefly  review  Spira's  algorithm.  (See  II. 1 for  a detailed 
presentation.)  Spira's  procedure  is  similar  to  Dijkstra's  except 
that  arcs  from  a particular  node  are  examined  one  by  one  in  order  of 
increasing  arclength.  Recall  that  this  enables  us  to  find  the  next 
best  path  in  only  O(log  n)  comparisons  using  played  binary  trees. 
Suppose  node  1 is  the  source.  At  each  stage  of  the  algorithm,  there 
is  a tree  (rooted  at  node  1)  of  shortest  paths  to  j other  labelled 
nodes.  In  addition,  from  each  labelled  node  p,  there  is  a one  arc 
extension  which  is  the  last  arc  in  the  next  best  path  that  is  a 
one  arc  extension  of  the  path  from  1 to  p. 


We find the shortest of all such paths, say [1,...,p,x], where p is
in the tree and (p,x) is the one arc extension. If x is unlabelled
(new) then we have found a shortest path from 1 to x, and the tree
grows by one node. If x is already labelled, this path is of no use
since we already have a shortest path to x. Thus the number of
iterations (i.e. # of times we play the tree to find the next best
path) depends on how often we are unlucky and find a path to a
labelled node. Fortunately, the average number of such unlucky trials
can be computed. More precisely we have:

Lemma: [5]

Let G = (N,A,ℓ) be a weighted digraph on n nodes such that

1) A = {(i,j) | i≠j}

2) The arclengths are independent, identically distributed
   nonnegative random variables.

Then the average number of iterations Spira's algorithm makes to
solve the single source problem for any source node i is O(n log n)
provided ties are broken randomly when the arclength lists are sorted and
when the binary tree is played to find the next best path.

Remark: Property 1) simply means that G contains all possible n(n-1)
arcs. G is called a clique on n nodes. The random tie breaking
rule is important. In [11], it is shown that for a certain "plausible"
deterministic tie breaking rule, O(n²) iterations may be used on the
average for certain probability distributions on the arclengths.

Proof: Let z_j be a {0,1} valued random variable s.t. z_j = 1 iff the
path we examine leads to a new node, given that we have found
shortest paths to j nodes. Now we claim that

    Pr[z_j = 1] ≥ (n-j)/n                    IV.2-1

Let p be one of the j labelled nodes. The set of unexamined arcs of
p can be partitioned into two subsets.

    UAO = {unexamined arcs leading to old nodes}

    UAN = {unexamined arcs leading to new nodes}

Now we observe that |UAO| + |UAN| ≤ n-1 (we may already have examined
and discarded some arcs and in this case |UAO| + |UAN| < n-1)
and |UAN| = n-j since every node has n-1 outgoing arcs and only j
nodes (p among them) are labelled. Because all arclengths are i.i.d.
random variables and ties are broken randomly, the next one arc
extension from p is equally likely to go to any node that is the sink
of an unexamined arc. Hence the probability that this one arc
extension from p goes to a new node is just

    |UAN| / (|UAO| + |UAN|)  ≥  (n-j)/(n-1)  ≥  (n-j)/n.

Now this argument holds for any of the labelled nodes and so IV.2-1
follows. Hence the expected number of paths we must examine before we
find a shortest path to a new node, given that we have labelled j
nodes, is

    ≤ n/(n-j).

Summing over all j we obtain that the overall average number
of iterations is

    ≤ Σ_{j=1}^{n-1} n/(n-j) = O(n log n).

Since the graph is a clique, this argument clearly holds for any
source i. □
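
The sum that closes the proof equals n times the harmonic number
H_(n-1), which is why it grows like n log n. The short sketch below
evaluates the sum directly and compares it with n ln n and
n (ln n + 1); the function name is an assumption made for the example.

```python
import math


def spira_iteration_bound(n):
    """Evaluate sum_{j=1}^{n-1} n/(n-j), the bound on expected iterations."""
    return sum(n / (n - j) for j in range(1, n))


for n in (10, 100, 1000):
    bound = spira_iteration_bound(n)
    lower = n * math.log(n)
    upper = n * (math.log(n) + 1)
    print(n, round(lower, 1), round(bound, 1), round(upper, 1))
```

For each n printed, the bound lies between n ln n and n (ln n + 1),
consistent with the O(n log n) statement of the lemma.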


Because the bound holds for any source i and because
the number of comparisons on each iteration is O(log n) (using played
binary trees), we conclude that the all pairs problem for a clique with
random nonnegative arclengths can be solved as n single source problems
on the average in O(n² log² n) comparisons + # comparisons needed to
sort the arclength lists. Recall that this latter quantity is just
O(n² log n). Even for a single source problem we must sort all
arclength lists and so in this case, the O(n² log n) sorting cost
dominates the O(n log² n) cost of performing the rest of the algorithm
for that source. In the all pairs problem, the O(n² log² n) cost is
dominant, and the sorting is worthwhile. In his paper, Spira also
shows that the variance in the number of iterations for a single
source problem is at most 3n².
However, the number of iterations required for the various single
source problems need not be independent. Thus, he can only bound
the variance of the total number of iterations for all sources by 3n⁴.
We now observe that, unfortunately, this O(n log n) result does
not necessarily hold if G is not a clique. There are two basic
problems. Firstly, since every node does not necessarily have an arc
to every other node, |UAN| / (|UAN| + |UAO|) is not necessarily lower
bounded by (n-j)/n. The exact value depends on the topology. In fact,
for the following graph, the expected number of iterations to solve
the SS problem is O(n²), when each of the nodes 1,...,n/4 is a source.
Thus the overall number of iterations is O(n³), for the AP problem.

In this graph, the set of arcs is

    {(i,i+1) | 1 ≤ i ≤ n-1} ∪ {(i,j) | n/2 ≤ j < i, i > n/2}.

For a source k, k < n/2, the nodes will initially be labelled in the
order k, k+1,...,n/2. Once we reach n/2 however, only O(1/n) of the
arcs lead to new nodes. Hence, we will examine O(n) arcs out of
n/2 on the average. The same clearly holds once any j > n/2 is
labelled. So, for a source k, we will examine (n/2)·O(n) arcs = O(n²)
arcs. The second problem is that the average will not necessarily be
the same for all sources in general. In the above example, if we take
node n as the source, we immediately have many arcs to new nodes and
thus the average will be different from that when node 1 is a source.

Spira's result does hold however, if we average over random
graphs, because the averaging enables us to make a statement about the
probability that a certain arc leads to a new node. More precisely
we have:

Lemma: Let n and γ be given positive integers with 0 < γ ≤ n(n-1),
and let LENGTHS be a finite set of nonnegative real numbers. Let
𝒢 be the collection of all weighted cliques on n nodes s.t. for
any G ∈ 𝒢, there is a subset A_G of the arcs of G satisfying

1) |A_G| = γ

2) (i,j) ∈ A_G ⇒ ℓ(i,j) ∈ LENGTHS, (i,j) ∉ A_G ⇒ ℓ(i,j) = ∞

(Note for G, G' ∈ 𝒢, A_G is not necessarily equal to A_G', but
|A_G| = |A_G'|. Also |LENGTHS| < ∞ ⇒ |𝒢| < ∞.) Assign the (discrete)
uniform probability measure to 𝒢. Then under the assumption of
random tie breaking rules for sorting arclength lists and playing
binary trees, Spira's algorithm uses an average of O(min(γ, n log n))
iterations to solve the single source problem for any source.

Proof: The assumptions of random tie breaking rules and a uniform
probability measure on 𝒢 imply that for any node i, all possible
orderings of the destinations of arcs leaving i (where the ordering is
in terms of arclengths) are equiprobable. This implies IV.2-1 holds
and the n log n part follows as before. Now, observe that if the
length of the shortest one arc extension from a certain node is
infinite, Spira's algorithm can be modified to consider that node as
being "blocked". In particular, on a clique of n nodes in which only γ
arclengths are finite Spira's algorithm need only perform γ iterations.
At that point the length of a shortest path to each node that has not
been labelled must be ∞. Thus the average number of iterations is
O(min(γ, n log n)). □

The previous lemma effectively embeds the collection of random
graphs with γ arcs and random arclengths into the collection of
random weighted cliques that have exactly γ finite arclengths.
IV. 3 Applications to Distributed Algorithms

We now present a distributed implementation of Spira's algorithm.
Our discussion will only informally outline the basic iteration since the
details of the initialization and data structures are similar to those
of previous algorithms. Again, NT(x) denotes the neighbor through which
the source has a shortest path to x, T is the set of nodes to which the
source has not found shortest paths, and for a source i, CANDIDATES(i,x)
is the ordered set of arclengths of x that are useful or potentially
useful to i. The ordering, ≻, on CANDIDATES(i,x) is defined as follows:
ℓ(x,y) ≻ ℓ(x,z) iff either ℓ(x,y) > ℓ(x,z), or ℓ(x,y) = ℓ(x,z) and i
received ℓ(x,y) after it received ℓ(x,z). When a node initially sorts
its own arclength list, it breaks ties randomly, i.e. if
ℓ(x,x_1) = ... = ℓ(x,x_k), then x randomly (uniformly) chooses any one
of the k! possible orderings.

Algorithm SP4

Each node i executes:

1. Let [i,...,r,x] be the next smallest path as determined by playing
   the binary tree of path distances. (Hereafter, we will say that node
   i examines the path [i,...,r,x].)

2. If x ∈ T then do:

     a. T ← T - {x}

     b. If r=i then NT(x) ← x.

     c. Else NT(x) ← NT(r)

     d. Ask nodes NT(r) and NT(x) for the next smallest
        arclengths of r and x respectively.

     e. end

3. Else do:

     a. CANDIDATES(i,r) ← CANDIDATES(i,r) - {ℓ(r,x)}

     b. Ask node NT(r) for the next smallest arc out of r.

     c. end

4. When the requested arclength(s) arrive, add it (them) to the
   appropriate CANDIDATES list(s). If there are any outstanding
   requests for the arclengths for r and/or x, send the appropriate
   information.

5. If T = ∅ then stop. Else go to 1.

In parallel, node i executes the following communication process:

When a neighbor j requests the next smallest arclengths of k do:
Suppose ℓ(k,x) is the arclength that i most recently gave j.

a. If there is an arclength ℓ(k,y) in CANDIDATES(i,k) s.t.
   ℓ(k,y) ≻ ℓ(k,x), then send j the smallest (in terms of ≻)
   such arclength ℓ(k,y).

b. Otherwise, record the fact that j has requested the next
   smallest arclength of k.

Comment: Step a) of the communication process requires that node i know
the arclength ℓ(k,x), which is the arclength of k that i most recently
sent to j. Either node i can remember this, or node j can supply this
information as part of the request. Also, if there are no more useful
arcs out of a certain node, then the next smallest arclength is
effectively infinite, and node i can send a message to this effect.
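
One way to realize the ordering on CANDIDATES(i,k) and step a) of the
communication process is to key every arclength by the pair
(length, arrival sequence number), so that ties in length are broken
by order of receipt. The class below is a minimal sketch under that
assumption; its name and methods are hypothetical, and removal of
pruned arcs is omitted.

```python
import bisect


class CandidateList:
    """CANDIDATES(i,k) kept sorted by (length, arrival order)."""

    def __init__(self):
        self._keys = []    # sorted list of (length, seq) pairs
        self._dest = {}    # (length, seq) -> destination node y
        self._seq = 0

    def add(self, y, length):
        """Record a newly received arclength l(k, y)."""
        key = (length, self._seq)
        self._seq += 1
        bisect.insort(self._keys, key)
        self._dest[key] = y
        return key

    def next_after(self, last_key):
        """Smallest entry strictly after last_key, or None if there is
        none yet (step b: the request must then be recorded)."""
        if last_key is None:
            pos = 0
        else:
            pos = bisect.bisect_right(self._keys, last_key)
        if pos == len(self._keys):
            return None
        key = self._keys[pos]
        return key, self._dest[key]


cands = CandidateList()
cands.add("y1", 3)
cands.add("y2", 1)
cands.add("y3", 3)      # same length as y1 but received later
first = cands.next_after(None)
second = cands.next_after(first[0])
third = cands.next_after(second[0])
```

The key returned with each answer is what node i would remember (or
what node j would echo back in its next request) as the arclength most
recently given.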

Deadlock Problems:

As was the case with algorithm SP3 (see III.3), one may suspect that
algorithm SP4 may deadlock. Again, we will prove that if all arclengths
are positive, then for any cycle [i_1,...,i_s = i_1] such that i_l has
requested some information from i_(l+1), at least one of the nodes will
be able to supply its neighbor with the requested information. That
this is sufficient to guarantee that the algorithm is deadlock free can
be seen as follows. At any instant of (real) time, the set of nodes N
can be partitioned into three subsets

    COMP = {nodes that are computing}

    WAIT = {nodes that are waiting for new information}

    FIN  = {nodes that have finished}.

If there is never a time when the nodes that are in N-FIN are all waiting
for new information from other nodes in N-FIN, then the fact that
Spira's algorithm terminates correctly after finitely many iterations
together with the fact that any node in FIN can answer any request
without waiting for more information implies that all nodes will
eventually finish. We also observe that if node i examines some path
[i,j] and requests the next arclengths, then this request can be
answered without waiting since by assumption i and j know their own
arclengths.

Thus a deadlock can occur only if there is a time when COMP = ∅,
|WAIT| ≥ 2, and each node i in WAIT is waiting to receive a new
arclength of r_i from NT(r_i), where NT(r_i) ∈ WAIT, NT(r_i) ≠ i, and
NT(r_i) ≠ r_i. Now since |WAIT| < ∞, the "pigeon hole" principle
implies that there must be a cycle of requests [i_1,...,i_s = i_1]. So
assuming the result that at least one of the nodes in the cycle can
answer, we conclude that at least one of the nodes in the cycle will
receive the information it requested and will move from WAIT
to COMP. (We note that the previous argument is valid only if
transmission delays are all finite. They can be arbitrary, but they
must be finite.)
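
The pigeon hole step can be made concrete: if every node in WAIT is
waiting on exactly one other node in WAIT, then following the "waits
on" pointers from any node must eventually revisit a node, and the
revisited portion is a cycle of requests. The sketch below assumes the
wait relation is given as a dictionary; the function name is
hypothetical.

```python
def find_request_cycle(waits_on):
    """Return a cycle [i_1, ..., i_s = i_1] in the wait-for relation.

    waits_on maps each waiting node to the waiting node it has asked
    for information; every value must itself be a key.
    """
    node = next(iter(waits_on))
    position = {}          # node -> index at which it was first visited
    path = []
    while node not in position:
        position[node] = len(path)
        path.append(node)
        node = waits_on[node]
    # node is the first repeated node; the cycle starts where it first
    # appeared on the path.
    return path[position[node]:] + [node]


cycle_a = find_request_cycle({1: 2, 2: 3, 3: 1})
cycle_b = find_request_cycle({4: 1, 1: 2, 2: 1})
```

In the second call node 4 waits on the cycle without being part of it,
so the walk starts outside the cycle and only the repeated portion is
returned.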

So now we turn to proving the result about a cycle of requests.

We first observe that in algorithm SP3 (III.3), if a node i chooses
a shortest path P[i,x] = [i,NT(x),...,x] from itself to x, then node NT(x)
must in fact choose the subpath of P[i,x] from itself to x as a shortest
path. This fact can be proved by induction on the number of arcs in
P[i,x], and we say that algorithm SP3 computes consistent shortest paths.
However, algorithm SP4 need not compute consistent shortest paths if ties
are broken randomly. Recall that this random tie breaking rule was used
in obtaining the average time bounds of section IV.2. While it may be
possible to devise a deterministic tie breaking rule which ensures that

[Figure: a cycle of requests among nodes 1, 2 and 3. Legend: a shortest
path (not necessarily one arc); path to y through 2; path to z through
3; path to x through 1.]

1 transitions into the WAIT state after examining the path through y to y_1c
2      "        "    "   "     "     "       "      "   "      "    z to z_2c
3      "        "    "   "     "     "       "      "   "      "    x to x_3c

Now because 1 receives arclengths of y from 2,
            2     "        "      of z from 3,
            3     "        "      of x from 1,

the following inequalities must hold

    ℓ(x,x_1c) ≥ ℓ(x,x_3c)
    ℓ(y,y_2c) ≥ ℓ(y,y_1c)          IV.3-1
    ℓ(z,z_3c) ≥ ℓ(z,z_2c)

We  claim  in  fact  that  at  least  one  of  the  inequalities  in  IV. 3-1  must  be 
a strict  inequality.  Suppose  that  they  are  all  equalities.  Then  because 
the  nodes  examine  paths  in  order  of  nondecreasing  length,  it  must  be  that 


d (1 ,x) 

+ 

£(x,x  )>  d(l,y) 

lc  — 

+ 

2(y'ylc) 

= d(l,y) 

+ 

My,y2c 

d(2,y) 

+ 

2,<Y'y2c)-  d(2,zJ 

+ 

2(z,Z2c) 

= d(2,z) 

+ 

2 (2  ' Z3c 

d(3,z) 

+ 

£(z,z  )>  d(3,x) 

oc  — 

+ 

£(x,x3c) 

- d(3,x) 

+ 

£(x,Xio 

IV. 3-2 


where  d(i,j)  is  the  shortest  distance  from  i to  j . Now  if  all  arclengths 
are  positive,  the  fact  that  1 (2,3)  has  a shortest  path  to  y(z,x)  through 
2 (3,1)  implies 


V! 

i 


f 


d(l,y)>  d(2  ,y) 


d(2,z)>  d(3,z) 


IV. 3-3 


d (3  ,x)  > d (1  ,x) 


Combining IV.3-2 and IV.3-3 we obtain $\ell(x, x_{1c}) > \ell(x, x_{1c})$.
This is a contradiction, and so we conclude that one of the inequalities in
IV.3-1 must be strict. Without loss of generality assume
$\ell(x, x_{1c}) > \ell(x, x_{3c})$. Then this means that node 1 does have a
new arclength of x to give to 3.
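As a reading aid, the combination of IV.3-2 and IV.3-3 can be written out as a single chain. This is a sketch of the step, under the supposition made above that all three inequalities in IV.3-1 hold with equality; each $\ge$ comes from one line of IV.3-2 and each $>$ from one line of IV.3-3.

```latex
\begin{aligned}
d(1,x) + \ell(x, x_{1c})
  &\ge d(1,y) + \ell(y, y_{2c}) \\
  &>   d(2,y) + \ell(y, y_{2c}) \\
  &\ge d(2,z) + \ell(z, z_{3c}) \\
  &>   d(3,z) + \ell(z, z_{3c}) \\
  &\ge d(3,x) + \ell(x, x_{1c}) \\
  &>   d(1,x) + \ell(x, x_{1c})
\end{aligned}
```

Cancelling $d(1,x)$ from the first and last expressions gives $\ell(x, x_{1c}) > \ell(x, x_{1c})$, which is the contradiction.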

We are not quite done, however, because if node 3 determines that the
path from itself through 1 to x and $x_{3c}$ is a new shortest path, it also
needs a new arclength out of $x_{3c}$. The fact that node 1 has an arclength
$\ell(x, x_{1c})$ s.t. $x_{1c} \ne x_{3c}$ implies that node 1 examined the
path from itself through x to $x_{3c}$ on a previous iteration. If the path
$P[3, x_{3c}]$ examined by 3 is a shortest path for 3, the optimality
principle implies that the subpaths of $P[3, x_{3c}]$ that go from 1 to x
and to $x_{3c}$ must also be shortest paths. Now even if 1's shortest path
to x, P[1,x], is not a subpath of $P[3, x_{3c}]$, it must have the same
length as the subpath of $P[3, x_{3c}]$ that goes from 1 to x. Hence the
path $P[1, x_{3c}]$ = (P[1,x] followed by $[x, x_{3c}]$) must be a shortest
path. When 1 examined $P[1, x_{3c}]$ either

a) it already had another shortest path to $x_{3c}$ or

b) it determined $P[1, x_{3c}]$ is a shortest path and waited for an
arclength of $x_{3c}$ before doing its next iteration.

In either case, node 1 must now also have an arclength of $x_{3c}$. Thus
node 1 can supply node 3 with all the information that 3 needs to perform
its next iteration. □


Remarks: 


a) One can now see why it is not necessary for algorithm SP4 to
compute consistent shortest paths in order to be deadlock free.
Even if 1 does not choose a shortest path to x that is consistent
with the shortest path to x chosen by 3, the strict inequality
d(3,x) > d(1,x) must still hold if $\ell(3,1) > 0$.

b) The previous proof works even with zero length arcs as
long as there are no zero length cycles.

Worst  Case  Communication  Cost: 

Each node requests and receives each arclength at most once.
Therefore, CC(SP4) ≤ 2Ln [arclength messages + request messages]. The
exact bit cost of a message depends on the details of the implementation
(see comment after description of SP4). In any case, the cost
of each message will be $O(\log n + \log \ell_{\max})$ bits.

Application  of  Results  of  IV. 2 

There is an almost exact correspondence between the number of
arclengths a node receives and the number of iterations of Spira's
algorithm it makes. (More precisely, # arclengths received ≤ #
iterations + n. A node may receive some arclengths but finish before
it uses them.) Therefore, the results of Section IV.2 apply to the
average communication cost. However, one must be careful when interpreting
their significance, and there are several points that merit
elaboration.

The first result of IV.2 stated that for a clique of n nodes with
i.i.d. random arclengths, a source performs O(n log n) iterations (average).
We showed, though, that this need not apply to an arbitrary graph. Since
many networks are not cliques (in fact connecting every pair of nodes
by a link is not only often economically unfeasible, but also defeats
the purpose of having a network to begin with), this result is of limited
practical value. However, we observe that our counterexample relied
heavily on a graph with O(n) nodes of large outdegree and small indegree.
It is not clear if a similar counterexample exists for symmetric,
connected graphs (data networks).

The second result stated that if we average over random digraphs with
γ arcs, then a similar O(n log n) average case bound holds. This result
is also of limited practical value. The class of random digraphs with
γ arcs includes many graphs that are not symmetric. In fact, the symmetric
digraphs comprise approximately only the fraction

\[
\frac{B\left(\frac{n(n-1)}{2}, \frac{\gamma}{2}\right)}{B\left(n(n-1), \gamma\right)}
\]

where B(n,k) is the binomial coefficient on n,k, of the total number of
digraphs on γ arcs. For large n and γ this is a very small number.
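To get a feel for how small this fraction is, the ratio can be evaluated exactly. The following is a minimal sketch; the function name `symmetric_fraction` and the requirement that γ be even are assumptions made for the illustration, not part of the original analysis.

```python
from fractions import Fraction
from math import comb


def symmetric_fraction(n: int, gamma: int) -> Fraction:
    """Approximate fraction of digraphs on n nodes with gamma arcs that are
    symmetric: B(n(n-1)/2, gamma/2) / B(n(n-1), gamma).

    Assumption for this sketch: gamma is even, so that the arcs can be
    paired up into gamma/2 undirected links.
    """
    if gamma % 2 != 0:
        raise ValueError("gamma must be even for a symmetric digraph")
    unordered_pairs = n * (n - 1) // 2   # possible undirected links
    ordered_pairs = n * (n - 1)          # possible directed arcs
    return Fraction(comb(unordered_pairs, gamma // 2),
                    comb(ordered_pairs, gamma))


if __name__ == "__main__":
    # Exact value for a tiny case, then a float for a slightly larger one.
    print(symmetric_fraction(4, 4))
    print(float(symmetric_fraction(10, 20)))
```

For n = 4 and γ = 4 the ratio is B(6,2)/B(12,4) = 15/495 = 1/33, and it shrinks rapidly as n and γ grow.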

Thus, it is not clear if the same average case bound holds for the smaller
class of symmetric graphs, and it appears to be more difficult to compute
the average in this case. Our proof of the bound for general digraphs
used the fact that all possible orderings (in terms of arclengths) of the
destinations of arcs leaving a node are equiprobable. In particular,
knowing that $\ell(i,j) < \infty$ tells us nothing about the value of
$\ell(j,i)$ relative to other arclengths of j. If we attempt to extend this
result by embedding the class of random symmetric digraphs with γ arcs into
the class of random cliques with γ finite arclengths s.t.
$\ell(r,s) < \infty \Rightarrow \ell(s,r) < \infty$, then, given that
$\ell(i,j) < \infty$, it is not clear that all orderings of arclengths of
j are equiprobable since $\ell(j,i)$ must also be finite.

Even  if  we  were  able  to  obtain  an  average  case  bound  for  random 
symmetric  graphs,  the  result  would  be  of  limited  practical  significance 
because  the  topology  of  a network  remains  relatively  fixed  while  many 
shortest  path  updates  are  performed.  The  significant  averaging  occurs 
only  over  the  arclength  ensemble. 

We  now  outline  a distributed  algorithm  that  exploits  the  properties 
of  statistical  averaging  over  the  arclengths  only,  in  any  fixed  topology. 
The  procedure  will  make  use  of  candidate  bit  vectors,  and  so  we  again 
assume  that  each  adjacency  list  is  assigned  some  order,  and  all  nodes 
know  all  ordered  adjacency  lists.  (It  is  important  to  realize  that  the 
orderings  of  the  adjacency  lists  are  entirely  arbitrary  and  must  be 
agreed  upon  by  all  nodes  only  once  beforehand.  Their  sole  purpose  is  to 
facilitate the use of bit vectors, and they can be repeatedly used on
successive shortest path updates.)

On each iteration, a source node i examines a path P[i,x]. If
P[i,x] is a new shortest path, we term the event a success. So given
that node i has found shortest paths to nodes $n_1, \ldots, n_\ell$, the
(conditional) probability of success can be expressed as

\[
\Pr[\text{success}] = \sum_{s=1}^{\ell} \Pr[P[i,x] \to n_s] \, \Pr[\text{success} \mid P[i,x] \to n_s]
\]

where $P[i,x] \to n_s$ means that the path P[i,x] is a one arc extension of
the known shortest path to $n_s$. In general, the probabilities in each term
of the sum depend upon the topology and the particular nodes
$n_1, \ldots, n_\ell$, and hence appear difficult to compute. However,
whatever the probabilities $\{\Pr[P[i,x] \to n_s]\}$ are, they sum to one,
and thus, if we are able to lower bound
$\Pr[\text{success} \mid P[i,x] \to n_s]$ by a constant a that is
independent of $\ell$, we obtain $\Pr[\text{success}] \ge a$. Our proof of
the first lemma in IV.2, in fact, used such a lower bound a.
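The averaging step can be spelled out in one line. In this sketch $p_s$ and $q_s$ are shorthand introduced here (not notation from the original) for $\Pr[P[i,x] \to n_s]$ and $\Pr[\text{success} \mid P[i,x] \to n_s]$, with the sum taken over the nodes $n_s$ already reached.

```latex
\Pr[\text{success}]
  = \sum_{s} p_s \, q_s
  \;\ge\; \sum_{s} p_s \, a
  = a \sum_{s} p_s
  = a ,
\qquad \text{since } q_s \ge a \text{ for every } s \text{ and } \sum_{s} p_s = 1 .
```

The point of the bound is that it requires no knowledge of the individual $p_s$, which are the quantities that depend on the topology.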


Now we will use candidate bit vectors to ensure that whenever a node
i receives the next arclength of a node j, the probability that this arc
leads to a new node is at least 1/2 most of the time. Because of deadlock
problems that will be discussed later, node i will have to communicate
directly with node j via a minimum hop path (rather than communicate with
node NT(j) as in algorithms SP3 and SP4) when i wants the next smallest
arclengths of j. In order to use bit vectors, each node j must also
maintain a list $A_{ij}$ of those nodes in AL(j) which j knows i has
labelled, for each i ≠ j. (Node j can do this with n|AL(j)| bits of
storage.) Note that node j may not have to know all nodes i has labelled
since i will only ask j for arclengths of j.


So  the  basic  procedure  is  to  have  each  node  i execute  Spira's 
algorithm  and  request  new  arclengths  directly  via  minimum  hop  paths. 
However,  whenever  i makes  a request  it  does  the  following: 

Suppose i examines a path P[i,x] = [i,...,r,x] and asks node r for
its next best arclength. If more than half the nodes in $AL(r) - A_{ir}$ are
unlabelled by i, i sends the request as is. Otherwise i sends r a bit
vector indicating which nodes in $AL(r) - A_{ir}$ i has labelled since it
last gave r a bit vector. (Node i does similarly with x if it also needs
an arclength from x.) Node r will respond with the smallest arclength
$\ell(r,t)$ s.t. $t \in AL(r) - A_{ir}$.
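The request rule just described can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure's actual implementation: the function names `make_request` and `respond`, the representation of a bit vector as a list of 0/1 values, and the choice to have r add each returned node to its copy of $A_{ir}$ (so that the same arc is not returned twice) are all hypothetical.

```python
def make_request(ordered_al_r, a_ir, labelled_by_i):
    """Node i's side. Return None for a plain request, or a bit vector
    (list of 0/1) over the nodes of AL(r) - A_ir, taken in the agreed
    adjacency-list order, marking those that i has labelled."""
    remaining = [t for t in ordered_al_r if t not in a_ir]
    unlabelled = [t for t in remaining if t not in labelled_by_i]
    if 2 * len(unlabelled) > len(remaining):
        return None  # more than half still unlabelled: send request as is
    return [1 if t in labelled_by_i else 0 for t in remaining]


def respond(ordered_al_r, arclength, a_ir, bit_vector=None):
    """Node r's side. Fold an optional bit vector into A_ir, then return
    (t, l(r,t)) for the smallest arclength with t in AL(r) - A_ir,
    or None if no such node remains."""
    remaining = [t for t in ordered_al_r if t not in a_ir]
    if bit_vector is not None:
        a_ir.update(t for t, bit in zip(remaining, bit_vector) if bit)
        remaining = [t for t in remaining if t not in a_ir]
    if not remaining:
        return None
    t = min(remaining, key=lambda node: arclength[node])
    # Assumption of this sketch: r records t so it is not sent again;
    # i will have examined the arc to t before its next request to r.
    a_ir.add(t)
    return t, arclength[t]


if __name__ == "__main__":
    al_r = ["a", "b", "c", "d"]                 # ordered adjacency list of r
    lengths = {"a": 5, "b": 2, "c": 7, "d": 1}  # arclengths l(r, t)
    a_ir = set()                                # what r knows i has labelled
    print(make_request(al_r, a_ir, set()))      # plain request
    print(respond(al_r, lengths, a_ir))
    vec = make_request(al_r, a_ir, {"a", "b", "d"})
    print(vec, respond(al_r, lengths, a_ir, vec))
```

In the usage above both sides share one `a_ir` set for brevity; in a distributed setting i and r would each keep their own copy, which stay in agreement because i knows exactly which bit vectors it has sent.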


Let  us  analyze  the  average  communication  cost  under  the  assumption 
that  for  each  node,  all  possible  orderings  of  the  destinations  of  arcs 
leaving  that  node  (ordering  is  by  arclength  with  random  tie  breaking 
rules) are equiprobable and independent of orderings at other nodes.


a) Bits in bit vectors:

Each time i sends r a bit vector, r learns that i has labelled
at least half the nodes that remain in $AL(r) - A_{ir}$. Thus i sends
r at most

\[
|AL(r)| + \frac{|AL(r)|}{2} + \frac{|AL(r)|}{4} + \cdots + 1 \le 2\,|AL(r)|
\]

bits in bit vectors, and so i sends at most
$2 \sum_{r} |AL(r)| = 4L$ bits in total for bit vectors.

b) # of arclengths received by i:

As previously mentioned, (# arclengths received by i) ≤ (# iterations
of Spira's algorithm i performs + n). Thus we compute the average
# of iterations. Consider the computation phase of some iteration,
say m, in which i examines a path [i,...,r,x]. Node i received
the arclength $\ell(r,x)$ from r during the communication phase of some
previous iteration m' < m. (Iteration m' is the most recent iteration
prior to m in which i examined a path that is a one arc
extension from r.) Now when i received $\ell(r,x)$, x was equally likely
to be any node in $AL(r) - A_{ir}$. The independence assumption implies,
however, that x is still equally likely to be any node in
$AL(r) - A_{ir}$ at the beginning of the m-th iteration (note: $A_{ir}$ at
the beginning of iteration m is the same as $A_{ir}$ at the end of
iteration m') regardless of which other paths i examined in between
iterations m' and m. What has changed of course is the probability
that x is an unlabelled node. Because of the use of bit vectors
though, there can be at most $\lceil \log_2 |AL(r)| \rceil$ iterations
involving r for which Pr[x is unlabelled] is ≤ 1/2. Since this holds for
any r, there can be at most

\[
\sum_{r \ne i} \lceil \log_2 |AL(r)| \rceil
\]

iterations for which Pr[success] is ≤ 1/2. For other iterations, i finds
a new shortest path with probability at least 1/2. Hence the average
number of these other iterations is ≤ 2n. Combining all this with the fact
that # arclengths received ≤ # iterations + n, we conclude that i
receives on the average at most

\[
\sum_{r} \lceil \log_2 |AL(r)| \rceil + 3n \text{ arclengths.}
\]
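The origin of the 3n term can be traced as follows. This is a sketch of the counting, assuming (as the argument above does) that every iteration outside the $\sum_r \lceil \log_2 |AL(r)| \rceil$ exceptional ones succeeds with probability at least 1/2 whatever happened before it; the symbols $N_{\text{good}}$ and $N_{\text{iter}}$ are introduced here for illustration only.

```latex
% A source labels at most n nodes, so at most n successes are needed.
% With success probability at least 1/2 per "good" iteration:
E[N_{\text{good}}] \;\le\; \frac{n}{1/2} \;=\; 2n ,
\qquad
E[N_{\text{iter}}] \;\le\; \sum_{r} \lceil \log_2 |AL(r)| \rceil + 2n ,
\qquad
E[\#\,\text{arclengths received}] \;\le\; E[N_{\text{iter}}] + n
  \;\le\; \sum_{r} \lceil \log_2 |AL(r)| \rceil + 3n .
```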


Since this analysis holds for any source i and since every message
or bit vector traverses a minimum hop path having at most D(G) arcs
we obtain an average total communication cost upperbounded by

\[
4Ln\,D(G) \text{ bits} \;+\; n\,D(G) \Big[ \sum_{r} \lceil \log_2 |AL(r)| \rceil + 3n \Big]
\text{ [arclength messages + request messages].}
\]

Because log is a concave (convex ∩) function,
$\sum_{r} \lceil \log_2 |AL(r)| \rceil \le n \lceil \log_2 \frac{2L}{n} \rceil$
and so the expression further simplifies to

\[
4Ln\,D(G) + n^2 D(G) \Big[ \Big\lceil \log_2 \frac{2L}{n} \Big\rceil + 3 \Big] .
\]
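For a concrete topology the unsimplified bound is easy to evaluate. The sketch below is only an illustration: the function name `average_cost_bound`, the dictionary representation of adjacency lists, and the convention that L counts undirected links (so that the adjacency list sizes sum to 2L) are assumptions, and the diameter D(G) is supplied by the caller rather than computed.

```python
def ceil_log2(d: int) -> int:
    """Integer ceiling of log2(d) for d >= 1, without floating point."""
    return (d - 1).bit_length()


def average_cost_bound(adjacency: dict, diameter: int):
    """Evaluate the two terms of the average-cost upper bound:
    (4 L n D(G) bits, n D(G) [sum_r ceil(log2 |AL(r)|) + 3n] messages).

    adjacency maps each node r to its adjacency list AL(r).
    Assumption: the graph is symmetric, so sum_r |AL(r)| = 2L.
    """
    n = len(adjacency)
    links = sum(len(al) for al in adjacency.values()) // 2
    bit_cost = 4 * links * n * diameter
    log_sum = sum(ceil_log2(len(al)) for al in adjacency.values() if al)
    message_cost = n * diameter * (log_sum + 3 * n)
    return bit_cost, message_cost


if __name__ == "__main__":
    triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
    print(average_cost_bound(triangle, diameter=1))
```

The sum here runs over all nodes r, which is at least as large as a sum restricted to r ≠ i, so the value remains an upper bound in either reading.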

The reader may now ask why this procedure cannot be modified so
that node i communicates with node NT(r) for the arclengths of r, and
thereby eliminate the D(G) factor. He or she can be certain that we have
tried, but as previously indicated, a serious deadlock problem arises.

We describe the problem by means of another three node example:

[Figure: second three-node example]

The above example is identical to the previous example of this section
with the exceptions of three new nodes $x_{3n}$, $y_{1n}$, and $z_{2n}$ and
the corresponding arcs. Now, unfortunately, one can show that the following
can occur:

Node 1 examines a path $[1, 2, \ldots, y, y_{1c}]$ and asks 2 for the next
smallest arc of y. Node 2 has a new arc, i.e. $y_{2c} \ne y_{1c}$, but node 1
already has a shortest path to $y_{2c}$ (this path is not shown), and the
next arc of y which is useful to 1 is in fact $(y, y_{1n})$ whose length
node 2 does not yet have. Analogous situations can also hold for (2,3,z) and
(3,1,x), and a deadlock exists.

Our previous result on cycles of requests for algorithm SP4 only
guarantees that at least one of $y_{2c}$, $z_{3c}$, $x_{1c}$ must be
different from one of $y_{1c}$, $z_{2c}$, $x_{3c}$ respectively. It does not
guarantee that one of the nodes has a new arc that is also potentially
useful to the requesting nodes. Were we able to eliminate the D(G) factor
(by some yet unknown technique), we would obtain an algorithm that uses
$O(Ln)$ bits + $O(n^2 \log \frac{L}{n})$ arclength messages of communication
on the average, and this represents at least a modest improvement over our
O(Ln) arclength messages algorithms.

V. SUGGESTIONS FOR FURTHER RESEARCH

1. New Algorithms and Better Analysis of Existing Algorithms

It would be satisfying to find a distributed shortest path algorithm
which uses asymptotically less communication than algorithm BAAD
(broadcasting all arclengths to all destinations). Thus far, our approach
has been to decompose the problem into n interacting single source
problems, and our single source algorithms have in some sense mimicked
centralized algorithms (though nontrivial deadlock questions arise if
one attempts to devise intelligent interactions). It is certainly possible
that an entirely new approach will be needed in order to beat the
O(Ln) arclengths bound. The reader should be aware that centralized
lower bounds apply to distributed algorithms which can be simulated on
one computer, in the following sense. Suppose the distributed algorithm
uses O(X) computations + O(Y) communications, and that it can be simulated
in such a way that one communication requires only a constant number
of computations in the simulation. Then if X is less than some centralized
lower bound, Y must be greater than that bound. Thus the
basic idea is to decrease Y at the expense of increasing X, i.e. make
the nodes "smarter". In some sense, candidate bit vectors (III.5) do
just that, because a node may use O(n) computations in interpreting the
meaning of one bit in a bit vector.

It would also be nice to have better analyses of existing algorithms.
In particular, can the effectiveness of the pruning heuristic of
algorithm SP3 be more precisely characterized, and can one find
average case bounds similar to those of IV.2 for symmetric connected
graphs?

2.  Simulation 

If  tighter  analytical  bounds  are  not  obtainable,  simulations  may 
provide  some  insight  into  the  average  behavior  of  existing  algorithms 
and  into  the  construction  of  worst  case  examples  for  the  pruning 
heuristic. 

3.  Models  of  Distributed  Algorithms  and  Lower  Bounds 

Since a lower bound is only valid within the context of some
model, we first need to develop reasonable models of the
computation-communication processes of distributed algorithms before
searching for such bounds. This appears to be a difficult task. However,
we suspect that an O(Ln) "somethings" worst case lower bound does exist,
where the "somethings" (perhaps bits) is to be determined by the model.

APPENDIX A

Worst Case Examples of Algorithm SP1 (III.4)

I. This shows CC(SP1) is not $O(n^3)$

[Figure: worst case example graph for algorithm SP1]

and fill in other arcs analogously, i.e. $\ell(i,i+1) = 1$,
$\ell(i,i+2) = 3$, $\ell(i,i+3) = 5$, etc.

By symmetry, the total communication cost is just n times the
number of distances received by node 1. A little work will show that
node 1 will hear i times about node k from node k-i, for i<k. Thus
node 1 hears $O(k^2)$ times about node k (since
$1 + 2 + \cdots + (k-1) = k(k-1)/2$) and so receives $O(n^3)$ distances.
Hence the total cost is $O(n^4)$.